HCV GENOMIC SEQUENCES FOR
DIAGNOSTICS AND THERAPEUTICS

This application is a continuation-in-part of U.S.
Serial No. 07/697,326 entitled "Polynucleotide Probes
Useful for Screening for Hepatitis C Virus," filed May
8, 1991.

Technical Field

The invention relates to compositions and methods
for the detection and treatment of hepatitis C virus
(HCV) infection, formerly referred to as blood-borne
non-A, non-B hepatitis virus (NANBV) infection. More
specifically, embodiments of the present invention
feature compositions and methods for the detection of
HCV, and for the development of vaccines for the
prophylactic treatment of infections of HCV, and
development of antibody products for conveying passive
immunity to HCV.

Background of the Invention 

The prototype isolate of HCV was characterized in
U.S. Patent Application Serial No. 122,714 (see also
EPO Publication No. 318,216). As used herein, the term
"HCV" includes new isolates of the same viral species.
The term "HCV-1" refers to the prototype isolate
characterized in U.S. Patent Application Serial No.
122,714.

HCV is a transmissible disease distinguishable
from other forms of viral-associated liver diseases,
including that caused by the known hepatitis viruses,
i.e., hepatitis A virus (HAV), hepatitis B virus (HBV),
and delta hepatitis virus (HDV), as well as the
hepatitis induced by cytomegalovirus (CMV) or
Epstein-Barr virus (EBV). HCV was first identified in
individuals who had received blood transfusions.

The demand for sensitive, specific methods for
screening and identifying carriers of HCV and HCV
contaminated blood or blood products is significant.
Post-transfusion hepatitis (PTH) occurs in
approximately 10% of transfused patients, and HCV
accounts for up to 90% of these cases. The disease
frequently progresses to chronic liver damage (25-55%).
Patient care as well as the prevention of
transmission of HCV by blood and blood products or by
close personal contact require reliable screening,
diagnostic and prognostic tools to detect nucleic
acids, antigens and antibodies related to HCV.

Information in this application suggests that HCV
has several genotypes. That is, the genetic
information of the HCV virus may not be totally
identical for all HCV, but encompasses groups with
differing genetic information.

Genetic information is stored in thread-like
molecules of DNA and RNA. DNA consists of covalently
linked chains of deoxyribonucleotides and RNA consists
of covalently linked chains of ribonucleotides. Each
nucleotide is characterized by one of four bases:
adenine (A), guanine (G), thymine (T), and cytosine
(C). The bases are complementary in the sense that,
due to the orientation of functional groups, certain
base pairs attract and bond to each other through
hydrogen bonding and π-stacking interactions.
Adenine in one strand of DNA pairs with thymine in an
opposing complementary strand. Guanine in one strand
of DNA pairs with cytosine in an opposing complementary
strand. In RNA, the thymine base is replaced by uracil
(U) which pairs with adenine in an opposing
complementary strand. The genetic code of living
organisms is carried in the sequence of base pairs.
Living cells interpret, transcribe and translate the
information of nucleic acid to make proteins and
peptides.
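The pairing rules just described (A with T, G with C,
and U in place of T in RNA) can be illustrated with a
minimal Python sketch. The function names and the short
example strands are illustrative assumptions and do not
come from the Sequence Listing.

```python
# Watson-Crick pairing rules as stated in the text:
# A pairs with T (or U in RNA), G pairs with C.
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def complement(strand: str, rna: bool = False) -> str:
    """Return the base-by-base complement of a strand."""
    pairs = RNA_PAIRS if rna else DNA_PAIRS
    return "".join(pairs[base] for base in strand.upper())


def reverse_complement(strand: str, rna: bool = False) -> str:
    """Return the opposing strand, read in its own 5'->3' direction."""
    return complement(strand, rna)[::-1]


def transcribe(dna: str) -> str:
    """Rewrite a DNA sequence as RNA by replacing thymine with uracil."""
    return dna.upper().replace("T", "U")


if __name__ == "__main__":
    # Hypothetical strand, chosen only for illustration.
    print(reverse_complement("ATGC"))
    print(transcribe("ATGC"))
```

Two strands are complementary in the sense used here
when one equals the reverse complement of the other.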

The HCV genome is comprised of a single positive
strand of RNA. The HCV genome possesses a continuous,
translational open reading frame (ORF) that encodes a
polyprotein of about 3,000 amino acids. In the ORF,
the structural protein(s) appear to be encoded in
approximately the first quarter of the N-terminus
region, with the majority of the polyprotein
responsible for non-structural proteins.

The HCV polyprotein comprises, from the amino
terminus to the carboxy terminus, the nucleocapsid
protein (C), the envelope protein (E), and the
non-structural proteins (NS) 1, 2 (b), 3, 4 (b), and 5.
HCV of differing genotypes may encode for proteins
which present an altered response to host immune
systems. HCV of differing genotypes may be difficult
to detect by immunodiagnostic techniques and nucleic
acid probe techniques which are not specifically
directed to such genotype.

Definitions for selected terms used in the
application are set forth below to facilitate an
understanding of the invention. The term
"corresponding" means homologous to or complementary to
a particular sequence of nucleic acid. As between
nucleic acids and peptides, corresponding refers to
amino acids of a peptide in an order derived from the
sequence of a nucleic acid or its complement.

The term "non-naturally occurring nucleic acid"
refers to a portion of genomic nucleic acid, cDNA,
semisynthetic nucleic acid, or synthetic origin nucleic
acid which, by virtue of its origin or manipulation:
(1) is not associated with all of a nucleic acid with
which it is associated in nature, (2) is linked to a
nucleic acid or other chemical agent other than that to
which it is linked in nature, or (3) does not occur in
nature.

WO 92/19743, PCT/US92/04036

Similarly, the term "non-naturally occurring
peptide" refers to a portion of a large naturally
occurring peptide or protein, or semi-synthetic or
synthetic peptide, which by virtue of its origin or
manipulation (1) is not associated with all of a
peptide with which it is associated in nature, (2) is
linked to peptides, functional groups or chemical
agents other than that to which it is linked in nature,
or (3) does not occur in nature.

The term "primer" refers to a nucleic acid which
is capable of initiating the synthesis of a larger
nucleic acid when placed under appropriate conditions.
The primer will be completely or substantially
complementary to a region of the nucleic acid to be
copied. Thus, under conditions conducive to
hybridization, the primer will anneal to a
complementary region of a larger nucleic acid. Upon
addition of suitable reactants, the primer is extended
by the polymerizing agent to form a copy of the larger
nucleic acid.

The term "binding pair" refers to any pair of
molecules which exhibit mutual affinity or binding
capacity. For the purposes of the present application,
the term "ligand" will refer to one molecule of the
binding pair, and the term "antiligand" or "receptor"
or "target" will refer to the opposite molecule of the
binding pair. For example, with respect to nucleic
acids, a binding pair may comprise two complementary
nucleic acids. One of the nucleic acids may be
designated the ligand and the other strand is
designated the antiligand, receptor or target. The
designation of ligand or antiligand is a matter of
arbitrary convenience. Other binding pairs comprise,
by way of example, antigens and antibodies, drugs and
drug receptor sites, and enzymes and enzyme substrates,
to name a few.

The term "label" refers to a molecular moiety
capable of detection including, by way of example,
without limitation, radioactive isotopes, enzymes,
luminescent agents, precipitating agents, and dyes.

The term "support" includes conventional supports
such as filters and membranes as well as retrievable
supports which can be substantially dispersed within a
medium and removed or separated from the medium by
immobilization, filtering, partitioning, or the like.
The term "support means" refers to supports capable of
being associated to nucleic acids, peptides or
antibodies by binding partners, or covalent or
noncovalent linkages.

A number of HCV strains and isolates have been
identified. When compared with the sequence of the
original isolate derived from the USA ("HCV-1"; see
Q.-L. Choo et al. (1989) Science 244:359-362, Q.-L.
Choo et al. (1990) Brit. Med. Bull. 46:423-441, Q.-L.
Choo et al., Proc. Natl. Acad. Sci. 88:2451-2455
(1991), and E.P.O. Patent Publication No. 318,216,
cited supra), it was found that a Japanese isolate
("HCV J1") differed significantly in both nucleotide
and polypeptide sequence within the NS3 and NS4
regions. This conclusion was later extended to the NS5
and envelope (E1/S and E2/NS1) regions (see K. Takeuchi
et al., J. Gen. Virol. (1990) 71:3027-3033, Y. Kubo et
al., Nucl. Acids. Res. (1989) 17:10367-10372, and K.
Takeuchi et al., Gene (1990) 91:287-291). The former
group of isolates, originally identified in the United
States, is termed "Genotype I" throughout the present
disclosure, while the latter group of isolates,
initially identified in Japan, is termed "Genotype II"
herein.

Brief Description of the Invention

The present invention features compositions of 
matter comprising nucleic acids and peptides 
corresponding to the HCV viral genome which define 
different genotypes. The present invention also 
features methods of using the compositions 
corresponding to sequences of the HCV viral genome 
which define different genotypes described herein.

A. Nucleic acid compositions
The nucleic acids of the present invention,
corresponding to sequences of the HCV viral genome
which define different genotypes, have utility as
probes in nucleic acid hybridization assays, as primers
for reactions involving the synthesis of nucleic acid,
as binding partners for separating HCV viral nucleic
acid from other constituents which may be present, and
as anti-sense nucleic acid for preventing the
transcription or translation of viral nucleic acid.

One embodiment of the present invention features a
composition comprising a non-naturally occurring
nucleic acid having a nucleic acid sequence of at least
eight nucleotides corresponding to a non-HCV-1
nucleotide sequence of the hepatitis C viral genome.
Preferably, the nucleotide sequence is selected from a
sequence present in at least one region consisting of
the NS5 region, envelope 1 region, 5'UT region, and the
core region.

Preferably, with respect to sequences which
correspond to the NS5 region, the sequence is selected
from a sequence within sequences numbered 2-22. The
sequence numbered 1 corresponds to HCV-1. Sequences
numbered 1-22 are defined in the Sequence Listing of
the application.

Preferably, with respect to sequences
corresponding to the envelope 1 region, the sequence is
selected from a sequence within sequences numbered
24-32. Sequence No. 23 corresponds to HCV-1.
Sequences numbered 23-32 are set forth in the Sequence
Listing of the application.

Preferably, with respect to the sequences which
correspond to the 5'UT region, the sequence is
selected from a sequence within sequences numbered
34-51. Sequence No. 33 corresponds to HCV-1. Sequences
No. 33-51 are set forth in the Sequence Listing of this
application.

Preferably, with respect to the sequences which
correspond to the core region, the sequence is selected
from a sequence within the sequences numbered 53-66.
Sequence No. 52 corresponds to HCV-1. Sequences 52-66
are set forth in the Sequence Listing of this
application.

The compositions of the present invention form 
hybridization products with nucleic acid corresponding 
to different genotypes of HCV. 

HCV has at least five genotypes, which will be
referred to in this application by the designations
GI-GV. The first genotype, GI, is exemplified by
sequences numbered 1-6, 23-25, 33-38 and 52-57. The
second genotype, GII, is exemplified by the sequences
numbered 7-12, 26-28, 39-45 and 58-64. The third
genotype, GIII, is exemplified by sequences numbered
13-17, 32, 46-47 and 65-66. The fourth genotype, GIV,
is exemplified by sequences numbered 20-22, 29-31
and 48-49. The fifth genotype, GV, is exemplified by
sequences numbered 18, 19, 50 and 51.
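The grouping above, together with the region ranges
given earlier in this section (NS5: 1-22, envelope 1:
23-32, 5'UT: 33-51, core: 52-66), can be held in a
simple lookup table. The following Python sketch is
illustrative only; the names REGIONS, GENOTYPES and
classify are assumptions made for this example.

```python
# Sequence-number ranges per genomic region, as listed in the text.
REGIONS = {
    "NS5": range(1, 23),
    "envelope 1": range(23, 33),
    "5'UT": range(33, 52),
    "core": range(52, 67),
}

# Sequence numbers exemplifying each genotype, as listed in the text.
GENOTYPES = {
    "GI": [*range(1, 7), *range(23, 26), *range(33, 39), *range(52, 58)],
    "GII": [*range(7, 13), *range(26, 29), *range(39, 46), *range(58, 65)],
    "GIII": [*range(13, 18), 32, 46, 47, 65, 66],
    "GIV": [*range(20, 23), *range(29, 32), 48, 49],
    "GV": [18, 19, 50, 51],
}


def classify(seq_no: int) -> tuple[str, str]:
    """Return (region, genotype) for a sequence number between 1 and 66."""
    region = next(name for name, nums in REGIONS.items() if seq_no in nums)
    genotype = next(name for name, nums in GENOTYPES.items() if seq_no in nums)
    return region, genotype


if __name__ == "__main__":
    # Sequences 1, 23, 33 and 52 correspond to HCV-1.
    for number in (1, 23, 33, 52):
        print(number, classify(number))
```

Each of the 66 sequence numbers falls in exactly one
region and one genotype group under this table.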

One embodiment of the present invention features
compositions comprising a nucleic acid having a
sequence corresponding to one or more sequences which
exemplify a genotype of HCV.

B. Method of forming a Hybridization Product
Embodiments of the present invention also feature
a method of forming a hybridization product with
nucleic acid having a sequence corresponding to HCV
nucleic acid. One method comprises the steps of
placing a non-naturally occurring nucleic acid having a
non-HCV-1 sequence corresponding to HCV nucleic acid
under conditions in which hybridization may occur. The
non-naturally occurring nucleic acid is capable of
forming a hybridization product with HCV nucleic acid,
under hybridization conditions. The method further
comprises the step of imposing hybridization conditions
to form a hybridization product in the presence of
nucleic acid corresponding to a region of the HCV
genome.

The formation of a hybridization product has
utility for detecting the presence of one or more
genotypes of HCV. Preferably, the non-naturally
occurring nucleic acid forms a hybridization product
with nucleic acid of HCV in one or more regions
comprising the NS5 region, envelope 1 region, 5'UT
region and the core region. To detect the
hybridization product, it is useful to associate the
non-naturally occurring nucleic acid with a label. The
formation of the hybridization product is detected by
separating the hybridization product from labeled
non-naturally occurring nucleic acid which has not
formed a hybridization product.

The formation of a hybridization product has
utility as a means of separating one or more genotypes
of HCV nucleic acid from other constituents potentially
present. For such applications, it is useful to
associate the non-naturally occurring nucleic acid with
a support for separating the resultant hybridization
product from the other constituents.

Nucleic acid "sandwich assays" employ one nucleic
acid associated with a label and a second nucleic acid
associated with a support. An embodiment of the
present invention features a sandwich assay comprising
two nucleic acids, both of which have sequences which
correspond to HCV nucleic acids; however, at least one
non-naturally occurring nucleic acid has a sequence
corresponding to non-HCV-1 HCV nucleic acid. At least
one nucleic acid is capable of associating with a
label, and the other is capable of associating with a
support. The support-associated non-naturally
occurring nucleic acid is used to separate the
hybridization products which include an HCV nucleic
acid and the non-naturally occurring nucleic acid
having a non-HCV-1 sequence.
One embodiment of the present invention features a
method of detecting one or more genotypes of HCV. The
method comprises the steps of placing a non-naturally
occurring nucleic acid under conditions in which
hybridization may occur. The non-naturally occurring
nucleic acid is capable of forming a hybridization
product with nucleic acid from one or more genotypes of
HCV. The first genotype, GI, is exemplified by
sequences numbered 1-6, 23-25, 33-38 and 52-57. The
second genotype, GII, is exemplified by the sequences
numbered 7-12, 26-28, 39-45 and 58-64. The third
genotype, GIII, is exemplified by sequences numbered
13-17, 32, 46-47 and 65-66. The fourth genotype, GIV,
is exemplified by sequences numbered 20-22, 29-31, 48
and 49. The fifth genotype, GV, is exemplified by
sequences numbered 18, 19, 50 and 51.

The hybridization product of HCV nucleic acid with
a non-naturally occurring nucleic acid having a
non-HCV-1 sequence corresponding to sequences within
the HCV genome has utility for priming a reaction for
the synthesis of nucleic acid.

The hybridization product of HCV nucleic acid with
a non-naturally occurring nucleic acid having a
sequence corresponding to a particular genotype of HCV
has utility for priming a reaction for the synthesis of
nucleic acid of such genotype. In one embodiment, the
synthesized nucleic acid is indicative of the presence
of one or more genotypes of HCV.

The synthesis of nucleic acid may also facilitate 
cloning of the nucleic acid into expression vectors 
which synthesize viral proteins. 

Embodiments of the present methods have utility as
anti-sense agents for preventing the transcription or
translation of viral nucleic acid. The formation of a
hybridization product between HCV nucleic acid and a
non-naturally occurring nucleic acid having sequences
which correspond to a particular genotype of HCV
genomic sequence may block translation or transcription
of such genotype. Therapeutic agents can be engineered
to include all five genotypes for inclusivity.

C. Peptide and antibody compositions
A further embodiment of the present invention
features a composition of matter comprising a
non-naturally occurring peptide of three or more amino
acids corresponding to a nucleic acid having a
non-HCV-1 sequence. Preferably, the non-HCV-1 sequence
corresponds with a sequence within one or more regions
consisting of the NS5 region, the envelope 1 region,
the 5'UT region, and the core region.

Preferably, with respect to peptides corresponding
to a nucleic acid having a non-HCV-1 sequence of the
NS5 region, the sequence is within sequences numbered
2-22. The sequence numbered 1 corresponds to HCV-1.
Sequences numbered 1-22 are set forth in the Sequence
Listing.

Preferably, with respect to peptides corresponding
to a nucleic acid having a non-HCV-1 sequence of the
envelope 1 region, the sequence is within sequences
numbered 24-32. The sequence numbered 23 corresponds
to HCV-1. Sequences numbered 23-32 are set forth in
the Sequence Listing.

Preferably, with respect to peptides corresponding
to a nucleic acid having a non-HCV-1 sequence directed
to the core region, the sequence is within sequences
numbered 53-66. Sequence numbered 52 corresponds to
HCV-1. Sequences numbered 52-66 are set forth in the
Sequence Listing.

The further embodiment of the present invention
features peptide compositions corresponding to nucleic
acid sequences of a genotype of HCV. The first
genotype, GI, is exemplified by sequences numbered 1-6,
23-25, 33-38 and 52-57. The second genotype, GII, is
exemplified by the sequences numbered 7-12, 26-28,
39-45 and 58-64. The third genotype, GIII, is
exemplified by sequences numbered 13-17, 32, 46-47 and
65-66. The fourth genotype, GIV, is exemplified by
sequences numbered 20-22, 29-31, 48 and 49. The fifth
genotype, GV, is exemplified by sequences numbered 18,
19, 50 and 51.

The non-naturally occurring peptides of the
present invention are useful as a component of a
vaccine. The sequence information of the present
invention permits the design of vaccines which are
inclusive for all or some of the different genotypes of
HCV. Directing a vaccine to a particular genotype
allows prophylactic treatment to be tailored to
maximise the protection to those agents likely to be
encountered. Directing a vaccine to more than one
genotype allows the vaccine to be more inclusive.

The peptide compositions are also useful for the
development of specific antibodies to the HCV
proteins. One embodiment of the present invention
features, as a composition of matter, an antibody to
peptides corresponding to a non-HCV-1 sequence of the
HCV genome. Preferably, the non-HCV-1 sequence is
selected from the sequence within a region consisting
of the NS5 region, the envelope 1 region, and the core
region. There are no peptides associated with the
untranslated 5'UT region.

Preferably, with respect to antibodies directed to
peptides of the NS5 region, the peptide corresponds to
a sequence within sequences numbered 2-22. Preferably,
with respect to antibodies directed to a peptide
corresponding to the envelope 1 region, the peptide
corresponds to a sequence within sequences numbered
24-32. Preferably, with respect to the antibodies
directed to peptides corresponding to the core region,
the peptide corresponds to a sequence within sequences
numbered 53-66.

Antibodies directed to peptides which reflect a
particular genotype have utility for the detection of
such genotypes of HCV and as therapeutic agents.

One embodiment of the present invention features
an antibody directed to a peptide corresponding to
nucleic acid having sequences of a particular
genotype. The first genotype, GI, is exemplified by
sequences numbered 1-6, 23-25, 33-38 and 52-57. The
second genotype, GII, is exemplified by the sequences
numbered 7-12, 26-28, 39-45 and 58-64. The third
genotype, GIII, is exemplified by sequences numbered
13-17, 32, 46-47 and 65-66. The fourth genotype, GIV,
is exemplified by sequences numbered 20-22, 29-31, 48
and 49. The fifth genotype, GV, is exemplified by
sequences numbered 18, 19, 50 and 51.

Individuals skilled in the art will readily
recognize that the compositions of the present
invention can be packaged with instructions for use in
the form of a kit for performing nucleic acid
hybridizations or immunochemical reactions.

The present invention is further described in the
following figures which illustrate sequences
demonstrating genotypes of HCV. The sequences are
designated by numerals 1-145, which numerals and
sequences are consistent with the numerals and
sequences set forth in the Sequence Listing. Sequences
146 and 147 facilitate the discussion of an assay,
which numerals and sequences are consistent with the
numerals and sequences set forth in the Sequence
Listing.

Brief Description of the Figures and Sequence Listing

Figure 1 depicts schematically the genetic 
organization of HCV; 

Figure 2 sets forth nucleic acid sequences
numbered 1-22 which sequences are derived from the NS5
region of the HCV viral genome;

Figure 3 sets forth nucleic acid sequences
numbered 23-32 which sequences are derived from the
envelope 1 region of the HCV viral genome;

Figure 4 sets forth nucleic acid sequences
numbered 33-51 which sequences are derived from the
5'UT region of the HCV viral genome; and

Figure 5 sets forth nucleic acid sequences
numbered 52-66 which sequences are derived from the
core region of the HCV viral genome.

The Sequence Listing sets forth the sequences of
sequences numbered 1-147.

Detailed Description of the Invention

The present invention will be described in detail
as nucleic acid having sequences corresponding to
the HCV genome and related peptides and binding
partners, for diagnostic and therapeutic applications.

The practice of the present invention will employ,
unless otherwise indicated, conventional techniques of
chemistry, molecular biology, microbiology, recombinant
DNA, and immunology, which are within the skill of the
art. Such techniques are explained fully in the
literature. See e.g., Maniatis, Fritsch & Sambrook,
Molecular Cloning: A Laboratory Manual (1982); DNA
Cloning, Volumes I and II (D.N. Glover ed. 1985);
Oligonucleotide Synthesis (M.J. Gait ed. 1984); Nucleic
Acid Hybridization (B.D. Hames & S.J. Higgins eds.
1984); the series, Methods in Enzymology (Academic
Press, Inc.), particularly Vol. 154 and Vol. 155 (Wu
and Grossman, eds.).

The cDNA libraries are derived from nucleic acid
sequences present in the plasma of an HCV-infected
chimpanzee. The construction of one of these
libraries, the "c" library (ATCC No. 40394), is
described in PCT Pub. No. WO90/14436. The sequences of
the library relevant to the present invention are set
forth herein as sequence numbers 1, 23, 33 and 52.

Nucleic acids isolated or synthesized in
accordance with features of the present invention are
useful, by way of example without limitation, as probes,
primers, anti-sense genes and for developing expression
systems for the synthesis of peptides corresponding to
such sequences.

The nucleic acid sequences described define
genotypes of HCV with respect to four regions of the
viral genome. Figure 1 depicts schematically the
organization of HCV. The four regions of particular
interest are the NS5 region, the envelope 1 region, the
5'UT region and the core region.

The sequences set forth in the present application
as sequences numbered 1-22 suggest at least five
genotypes in the NS5 region. Sequences numbered 1-22
are depicted in Figure 2 as well as the Sequence
Listing. Each sequence numbered 1-22 is derived from
nucleic acid having 340 nucleotides from the NS5 region.

The five genotypes are defined by groupings of the
sequences defined by sequences numbered 1-22. For
convenience, in the present application, the different
genotypes will be assigned roman numerals and the
letter "G".

The first genotype (GI) is exemplified by
sequences within sequences numbered 1-6. A second
genotype (GII) is exemplified by sequences within
sequences numbered 7-12. A third genotype (GIII) is
exemplified by the sequences within sequences numbered
13-17. A fourth genotype (GIV) is exemplified by
sequences within sequences numbered 20-22. A fifth
genotype (GV) is exemplified by sequences within
sequences numbered 18 and 19.

The sequences set forth in the present application
as sequences numbered 23-32 suggest at least four
genotypes in the envelope 1 region of HCV. Sequences
numbered 23-32 are depicted in Figure 3 as well as in
the Sequence Listing. Each sequence numbered 23-32 is
derived from nucleic acid having 100 nucleotides from
the envelope 1 region.

A first envelope 1 genotype group (GI) is
exemplified by the sequences within the sequences
numbered 23-25. A second envelope 1 genotype (GII)
is exemplified by sequences within sequences
numbered 26-28. A third envelope 1 genotype (GIII) is
exemplified by the sequences within sequence numbered
32. A fourth envelope 1 genotype (GIV) is exemplified
by the sequences within sequences numbered 29-31.

The sequences set forth in the present application
as sequences numbered 33-51 suggest at least three
genotypes in the 5'UT region of HCV. Sequences
numbered 33-51 are depicted in Figure 4 as well as in
the Sequence Listing. Each sequence numbered 33-51 is
derived from the nucleic acid having 252 nucleotides
from the 5'UT region, although sequences 50 and 51 are
somewhat shorter at approximately 180 nucleotides.

The first 5'UT genotype (GI) is exemplified by the
sequences within sequences numbered 33-38. A second
5'UT genotype (GII) is exemplified by the sequences
within sequences numbered 39-45. A third 5'UT genotype
(GIII) is exemplified by the sequences within sequences
numbered 46-47. A fourth 5'UT genotype (GIV) is
exemplified by sequences within sequences numbered 48
and 49. A fifth 5'UT genotype (GV) is exemplified by
sequences within sequences numbered 50 and 51.

The sequences numbered 52-66 suggest at least
three genotypes in the core region of HCV. The
sequences numbered 52-66 are depicted in Figure 5 as
well as in the Sequence Listing.

The first core region genotype (GI) is exemplified
by the sequences within sequences numbered 52-57. The
second core region genotype (GII) is exemplified by
sequences within sequences numbered 58-64. The third
core region genotype (GIII) is exemplified by sequences
within sequences numbered 65 and 66. Sequences
numbered 52-65 are comprised of 549 nucleotides.
Sequence numbered 66 is comprised of 510 nucleotides.

The various genotypes described with respect to
each region are consistent. That is, HCV having
features of the first genotype with respect to the NS5
region will substantially conform to features of the
first genotype of the envelope 1 region, the 5'UT
region and the core region.

Nucleic acid isolated or synthesized in accordance
with the sequences set forth in sequence numbers 1-66
are useful as probes, primers, capture ligands and
anti-sense agents. As probes, primers, capture ligands
and anti-sense agents, the nucleic acid will normally
comprise approximately eight or more nucleotides for
specificity as well as the ability to form stable
hybridization products.

Probes

A nucleic acid isolated or synthesized in
accordance with a sequence defining a particular
genotype of a region of the HCV genome can be used as a
probe to detect such genotype or used in combination
with other nucleic acid probes to detect substantially
all genotypes of HCV.

With the sequence information set forth in the
present application, sequences of eight or more
nucleotides are identified which provide the desired
inclusivity and exclusivity with respect to various
genotypes within HCV, and extraneous nucleic acid
sequences likely to be encountered during hybridization
conditions.
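One way to picture the selection of such sequences is
as a search for subsequences of eight nucleotides that
occur in every sequence of the target genotype
(inclusivity) and in none of the other sequences
(exclusivity). The Python sketch below is an
illustration under simplifying assumptions: it uses
exact string matching, whereas real hybridization
tolerates some mismatch, and the function names and
toy sequences are hypothetical and not taken from the
Sequence Listing.

```python
def kmers(seq: str, k: int = 8) -> set[str]:
    """Return every subsequence of length k found in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def candidate_probes(targets: list[str], non_targets: list[str],
                     k: int = 8) -> set[str]:
    """k-mers present in all targets and absent from all non-targets."""
    if not targets:
        return set()
    # Inclusivity: the k-mer must occur in every target sequence.
    shared = set.intersection(*(kmers(t, k) for t in targets))
    # Exclusivity: the k-mer must not occur in any non-target sequence.
    excluded = set().union(*(kmers(n, k) for n in non_targets))
    return shared - excluded


if __name__ == "__main__":
    # Toy sequences standing in for two genotype groups.
    same_genotype = ["ACGTACGTTT", "GGACGTACGTAA"]
    other_genotype = ["ACGTACGAAA"]
    print(candidate_probes(same_genotype, other_genotype))
```

Passing sequences of several genotypes as targets
gives candidates that are inclusive for all of them.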

Individuals skilled in the art will readily
recognize that the nucleic acid sequences, for use as
probes, can be provided with a label to facilitate
detection of a hybridization product.

Capture Ligand

For use as a capture ligand, the nucleic acid
selected in the manner described above with respect to
probes can be readily associated with supports. The
manner in which nucleic acid is associated with
supports is well known. Nucleic acid having sequences
corresponding to a sequence within sequences numbered
1-66 have utility to separate viral nucleic acid of one
genotype from the nucleic acid of HCV of a different
genotype. Nucleic acid isolated or synthesized in
accordance with sequences within sequences numbered
1-66, used in combinations, have utility to capture
substantially all nucleic acid of all HCV genotypes.

Primers 

Nucleic acid isolated or synthesized in accordance
with the sequences described herein have utility as
primers for the amplification of HCV sequences. With
respect to polymerase chain reaction (PCR) techniques,
nucleic acid sequences of eight or more nucleotides
corresponding to one or more sequences of sequences
numbered 1-66 have utility in conjunction with suitable
enzymes and reagents to create copies of the viral
nucleic acid. A plurality of primers having different
sequences corresponding to more than one genotype can
be used to create copies of viral nucleic acid for such
genotypes.

The copies can be used in diagnostic assays to
detect HCV virus. The copies can also be incorporated
into cloning and expression vectors to generate
polypeptides corresponding to the nucleic acid
synthesized by PCR, as will be described in greater
detail below.

Anti-sense 

Nucleic acid isolated or synthesized in accordance
with the sequences described herein has utility as
anti-sense genes to prevent the expression of HCV.

Nucleic acid corresponding to a genotype of HCV is
loaded into a suitable carrier such as a liposome for
introduction into a cell infected with HCV. A nucleic
acid having eight or more nucleotides is capable of
binding to viral nucleic acid or viral messenger RNA.
Preferably, the anti-sense nucleic acid is comprised of
30 or more nucleotides to provide necessary stability
of a hybridization product of viral nucleic acid or
viral messenger RNA. Methods for loading anti-sense
nucleic acid are known in the art as exemplified by U.S.
Patent 4,241,046 issued December 23, 1980 to
Papahadjopoulos et al.
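
The base-pairing logic described above can be sketched
in a few lines of Python. This is an illustration only:
the target string and the function name are hypothetical
and are not taken from sequences numbered 1-66. The
sketch derives an anti-sense DNA oligonucleotide as the
reverse complement of a target and checks it against the
eight-nucleotide minimum and the preferred length of 30
nucleotides stated in the text.

```python
# Hypothetical illustration: derive an anti-sense DNA oligonucleotide
# for an RNA (or DNA) target by taking its reverse complement.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A", "T": "A"}

MIN_BINDING_LENGTH = 8   # minimum length stated in the text
PREFERRED_LENGTH = 30    # preferred length stated in the text


def antisense_oligo(target: str) -> str:
    """Return the DNA reverse complement (written 5'->3') of the target."""
    target = target.upper()
    if len(target) < MIN_BINDING_LENGTH:
        raise ValueError("target shorter than eight nucleotides")
    return "".join(COMPLEMENT[base] for base in reversed(target))


# Arbitrary 30-nucleotide RNA string, used only to exercise the function.
example_target = "GGAUCCAUGCAUGCAUGGCCAAUUGGCCAA"
oligo = antisense_oligo(example_target)
meets_preferred_length = len(oligo) >= PREFERRED_LENGTH
```

The sketch covers sequence complementarity only; carrier
loading and delivery are outside its scope.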

Peptide Synthesis 

Nucleic acid isolated or synthesized in accordance
with the sequences described herein has utility to
generate peptides. The sequences exemplified by
sequences numbered 1-32 and 52-66 can be cloned into
suitable vectors or used to isolate nucleic acid. The
isolated nucleic acid is combined with suitable DNA
linkers and cloned into a suitable vector. The vector
can be used to transform a suitable host organism such
as E. coli and the peptide encoded by the sequences
isolated.

Molecular cloning techniques are described in the
text Molecular Cloning: A Laboratory Manual, Maniatis
et al., Cold Spring Harbor Laboratory (1982).

The isolated peptide has utility as an antigenic 
substance for the development of vaccines and 
antibodies directed to the particular genotype of HCV. 

Vaccines and Antibodies 

The peptide materials of the present invention 
have utility for the development of antibodies and 
vaccines. 

The availability of cDNA sequences, or nucleotide
sequences derived therefrom (including segments and
modifications of the sequence), permits the
construction of expression vectors encoding
antigenically active regions of the peptide encoded in
either strand. The antigenically active regions may be
derived from the NS5 region, envelope 1 region, and
the core region.

Fragments encoding the desired peptides are
derived from the cDNA clones using conventional
restriction digestion or by synthetic methods, and are
ligated into vectors which may, for example, contain
portions of fusion sequences such as beta galactosidase
or superoxide dismutase (SOD), preferably SOD. Methods
and vectors which are useful for the production of
polypeptides which contain fusion sequences of SOD are
described in European Patent Office Publication number
0196056, published October 1, 1986.

Any desired portion of the HCV cDNA containing an
open reading frame, in either sense strand, can be
obtained as a recombinant peptide, such as a mature or
fusion protein; alternatively, a peptide encoded in the
cDNA can be provided by chemical synthesis.

The DNA encoding the desired peptide, whether in
fused or mature form, and whether or not containing a
signal sequence to permit secretion, may be ligated
into expression vectors suitable for any convenient
host. Both eukaryotic and prokaryotic host systems are
presently used in forming recombinant peptides. The
peptide is then isolated from lysed cells or from the
culture medium and purified to the extent needed for
its intended use. Purification may be by techniques
known in the art, for example, differential extraction,
salt fractionation, chromatography on ion exchange
resins, affinity chromatography, centrifugation, and
the like. See, for example, Methods in Enzymology for
a variety of methods for purifying proteins. Such
peptides can be used as diagnostics, or those which
give rise to neutralizing antibodies may be formulated
into vaccines. Antibodies raised against these
peptides can also be used as diagnostics, or for
passive immunotherapy or for isolating and identifying
HCV.

An antigenic region of a peptide is generally
relatively small, typically 8 to 10 amino acids or less
in length. Fragments of as few as 5 amino acids may
characterize an antigenic region. These segments may
correspond to the NS5 region, envelope 1 region, and the
core region of the HCV genome. The 5'UT region is not
known to be translated. Accordingly, using the cDNAs
of such regions, DNAs encoding short segments of HCV
peptides corresponding to such regions can be expressed
recombinantly either as fusion proteins, or as isolated
peptides. In addition, short amino acid sequences can
be conveniently obtained by chemical synthesis. In
instances wherein the synthesized peptide is correctly
configured so as to provide the correct epitope, but is
too small to be immunogenic, the peptide may be linked
to a suitable carrier.

A number of techniques for obtaining such linkage
are known in the art, including the formation of
disulfide linkages using N-succinimidyl-3-(2-pyridylthio)propionate
(SPDP) and succinimidyl
4-(N-maleimido-methyl)cyclohexane-1-carboxylate (SMCC)
obtained from Pierce Company, Rockford, Illinois (if
the peptide lacks a sulfhydryl group, this can be
provided by addition of a cysteine residue). These
reagents create a disulfide linkage between themselves
and peptide cysteine residues on one protein and an
amide linkage through the epsilon-amino on a lysine, or
other free amino group in the other. A variety of such
disulfide/amide-forming agents are known. See, for
example, Immun Rev (1982) 62:185. Other bifunctional
coupling agents form a thioether rather than a
disulfide linkage. Many of these thio-ether-forming
agents are commercially available and include reactive
esters of 6-maleimidocaproic acid, 2-bromoacetic acid,
2-iodoacetic acid, 4-(N-maleimido-methyl)cyclohexane-1-carboxylic
acid, and the like. The carboxyl groups can
be activated by combining them with succinimide or
1-hydroxyl-2-nitro-4-sulfonic acid, sodium salt.
Additional methods of coupling antigens employ the
rotavirus/"binding peptide" system described in EPO
Pub. No. 259,149, the disclosure of which is
incorporated herein by reference. The foregoing list
is not meant to be exhaustive, and modifications of the
named compounds can clearly be used.

Any carrier may be used which does not itself
induce the production of antibodies harmful to the
host. Suitable carriers are typically large, slowly
metabolized macromolecules such as proteins;
polysaccharides, such as latex functionalized
sepharose, agarose, cellulose, cellulose beads and the
like; polymeric amino acids, such as polyglutamic acid,
polylysine, and the like; amino acid copolymers; and
inactive virus particles. Especially useful protein
substrates are serum albumins, keyhole limpet
hemocyanin, immunoglobulin molecules, thyroglobulin,
ovalbumin, tetanus toxoid, and other proteins well
known to those skilled in the art.

Peptides comprising HCV amino acid sequences
encoding at least one viral epitope derived from the
NS5, envelope 1, and core region are useful
immunological reagents. The 5'UT region is not known
to be translated. For example, peptides comprising
such truncated sequences can be used as reagents in an
immunoassay. These peptides also are candidate subunit
antigens in compositions for antiserum production or
vaccines. While the truncated sequences can be
produced by various known treatments of native viral
protein, it is generally preferred to make synthetic or
recombinant peptides comprising HCV sequence. Peptides
comprising these truncated HCV sequences can be made up
entirely of HCV sequences (one or more epitopes, either
contiguous or noncontiguous), or HCV sequences and
heterologous sequences in a fusion protein. Useful
heterologous sequences include sequences that provide
for secretion from a recombinant host, enhance the
immunological reactivity of the HCV epitope(s), or
facilitate the coupling of the polypeptide to an
immunoassay support or a vaccine carrier. See, e.g.,
EPO Pub. No. 116,201; U.S. Pat. No. 4,722,840; EPO Pub.
No. 259,149; U.S. Pat. No. 4,629,783.

The size of peptides comprising the truncated HCV
sequences can vary widely, the minimum size being a
sequence of sufficient size to provide an HCV epitope,
while the maximum size is not critical. For
convenience, the maximum size usually is not
substantially greater than that required to provide the
desired HCV epitopes and function(s) of the
heterologous sequence, if any. Typically, the
truncated HCV amino acid sequence will range from about
5 to about 100 amino acids in length. More typically,
however, the HCV sequence will be a maximum of about 50
amino acids in length, preferably a maximum of about 30
amino acids. It is usually desirable to select HCV
sequences of at least about 10, 12 or 15 amino acids,
up to a maximum of about 20 or 25 amino acids.

HCV amino acid sequences comprising epitopes can
be identified in a number of ways. For example, the
entire protein sequence corresponding to each of the
NS5, envelope 1, and core regions can be screened by
preparing a series of short peptides that together span
the entire protein sequence of such regions. By
starting with, for example, peptides of approximately
100 amino acids, it would be routine to test each
peptide for the presence of epitope(s) showing a
desired reactivity, and then to test progressively
smaller and overlapping fragments from an identified
peptide of 100 amino acids to map the epitope of
interest. Screening such peptides in an immunoassay is
within the skill of the art. It is also known to carry
out a computer analysis of a protein sequence to
identify potential epitopes, and then prepare peptides
comprising the identified regions for screening.
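
The idea of a series of overlapping peptides that
together span a region can be sketched as follows. The
function name, the toy sequence, and the window and step
sizes are hypothetical choices for illustration; the text
does not prescribe particular values beyond the sizes
mentioned above.

```python
# Hypothetical sketch: enumerate overlapping peptides that together
# span a protein sequence, as a starting point for epitope screening.
def overlapping_peptides(protein: str, length: int, step: int):
    """Return (start, peptide) pairs; start positions are 1-based.

    A final window is added when needed so that the C-terminal
    residues are always covered.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    if len(protein) <= length:
        return [(1, protein)]
    last_start = len(protein) - length
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)
    return [(s + 1, protein[s:s + length]) for s in starts]


# Toy 20-residue sequence, not an HCV sequence.
toy_protein = "ACDEFGHIKLMNPQRSTVWY"
octamers = overlapping_peptides(toy_protein, length=8, step=4)
```

A first pass with long windows followed by a second pass
with shorter windows over any reactive peptide mirrors
the progressive narrowing described above.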

The immunogenicity of the epitopes of HCV may also
be enhanced by preparing them in mammalian or yeast
systems fused with or assembled with particle-forming
proteins such as, for example, that associated with
hepatitis B surface antigen. See, e.g., U.S. Pat. No.
4,722,840. Constructs wherein the HCV epitope is linked
directly to the particle-forming protein coding sequences
produce hybrids which are immunogenic with respect to
the HCV epitope. In addition, all of the vectors
prepared include epitopes specific to HBV, having
various degrees of immunogenicity, such as, for
example, the pre-S peptide. Thus, particles
constructed from particle forming protein which include
HCV sequences are immunogenic with respect to HCV and
HBV.

Hepatitis surface antigen (HBSAg) has been shown
to be formed and assembled into particles in S.
cerevisiae (P. Valenzuela et al. (1982)), as well as
in, for example, mammalian cells (P. Valenzuela et al.
(1984)). The formation of such particles has been shown
to enhance the immunogenicity of the monomer subunit.
The constructs may also include the immunodominant
epitope of HBSAg, comprising the 55 amino acids of the
presurface (pre-S) region. Neurath et al. (1984).
Constructs of the pre-S-HBSAg particle expressible in
yeast are disclosed in EPO 174,444, published March 19,
1986; hybrids including heterologous viral sequences
for yeast expression are disclosed in EPO 175,261,
published March 26, 1986. These constructs may also be
expressed in mammalian cells such as Chinese hamster
ovary (CHO) cells using an SV40-dihydrofolate reductase
vector (Michelle et al. (1984)).

In addition, portions of the particle-forming
protein coding sequence may be replaced with codons
encoding an HCV epitope. In this replacement, regions
which are not required to mediate the aggregation of
the units to form immunogenic particles in yeast or
mammals can be deleted, thus eliminating additional HBV
antigenic sites from competition with the HCV epitope.

Vaccines 

Vaccines may be prepared from one or more
immunogenic peptides derived from HCV. The observed
homology between HCV and Flaviviruses provides
information concerning the peptides which are likely to
be most effective as vaccines, as well as the regions
of the genome in which they are encoded.

Multivalent vaccines against HCV may be comprised
of one or more epitopes from one or more proteins
derived from the NS5, envelope 1, and core regions. In
particular, vaccines are contemplated comprising one or
more HCV proteins or subunit antigens derived from the
NS5, envelope 1, and core regions. The 5'UT region is
not known to be translated.

The preparation of vaccines which contain an
immunogenic peptide as an active ingredient is known
to one skilled in the art. Typically, such vaccines
are prepared as injectables, either as liquid solutions
or suspensions; solid forms suitable for solution in,
or suspension in, liquid prior to injection may also be
prepared. The preparation may also be emulsified, or
the protein encapsulated in liposomes. The active
immunogenic ingredients are often mixed with excipients
which are pharmaceutically acceptable and compatible
with the active ingredient. Suitable excipients are,
for example, water, saline, dextrose, glycerol,
ethanol, or the like and combinations thereof. In
addition, if desired, the vaccine may contain minor
amounts of auxiliary substances such as wetting or
emulsifying agents, pH buffering agents, and/or
adjuvants which enhance the effectiveness of the
vaccine. Examples of adjuvants which may be effective
include but are not limited to: aluminum hydroxide,
N-acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP),
N-acetyl-nor-muramyl-L-alanyl-D-isoglutamine (CGP
11637, referred to as nor-MDP),
N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine
(CGP 19835A, referred to as MTP-PE), and RIBI, which
contains three components extracted from bacteria,
monophosphoryl lipid A, trehalose dimycolate and cell
wall skeleton (MPL+TDM+CWS) in a 2% squalene/Tween 80
emulsion. The effectiveness of an adjuvant may be
determined by measuring the amount of antibodies
directed against an immunogenic peptide containing an
HCV antigenic sequence resulting from administration of
this peptide in vaccines which are also comprised of
the various adjuvants.

The vaccines are conventionally administered
parenterally, by injection, for example, either
subcutaneously or intramuscularly. Additional
formulations which are suitable for other modes of
administration include suppositories and, in some
cases, oral formulations. For suppositories,
traditional binders and carriers may include, for
example, polyalkylene glycols or triglycerides; such
suppositories may be formed from mixtures containing
the active ingredient in the range of 0.5% to 10%,
preferably 1%-2%. Oral formulations include such
normally employed excipients as, for example,
pharmaceutical grades of mannitol, lactose, starch,
magnesium stearate, sodium saccharine, cellulose,
magnesium carbonate, and the like.

The examples below are provided for illustrative
purposes and are not intended to limit the scope of the
present invention.

I. Detection of HCV RNA from Serum

RNA was extracted from serum using guanidinium
salt, phenol and chloroform according to the
instructions of the kit manufacturer (RNAzol™ B kit,
Cinna/Biotecx). Extracted RNA was precipitated with
isopropanol and washed with ethanol. A total of 25
µl serum was processed for RNA isolation, and the
purified RNA was resuspended in 5 µl diethyl
pyrocarbonate treated water for subsequent cDNA
synthesis.

cDNA Synthesis and Polymerase Chain Reaction (PCR)
Amplification

Table 1 lists the sequence and position (with
reference to HCV1) of all the PCR primers and probes
used in these examples. Letter designations for



WO 92/19743    PCT/US92/04036



nucleotides are consistent with 37 C.F.R. §§1.821-1.825.
Thus, the letters A, C, G, T, and U are used in
the ordinary sense of adenine, cytosine, guanine,
thymine, and uracil. The letter M means A or C; R
means A or G; W means A or T/U; S means C or G; Y means
C or T/U; K means G or T/U; V means A or C or G, not
T/U; H means A or C or T/U, not G; D means A or G or
T/U, not C; B means C or G or T/U, not A; N means (A or
C or G or T/U) or (unknown or other). Table 1 is set
forth below:

Table 1

Seq. No.  Sequence (5'-3')                 Nucleotide Position

67        CAAACGTAACACCAACCGRCGCCCACAGG    374-402
68        ACAGAYCCGCAKAGRTCCCCCACG         1192-1169
69        GCAACCTCGAGGTAGACGTCAGCCTATCCC   509-538
70        GCAACCTCGTGGAAGGCGACAACCTATCCC   509-538
71        GTCACCAATGATTGCCCTAACTCGAGTATT   948-977
72        GTCACGAACGACTGCTCCAACTCAAG       948-973
73        TGGACATGATCGCTGGWGCYCACTGGGG     1375-1402
74        TGGAYATGGTGGYGGGGGCYCACTGGGG     1375-1402
75        ATGATGAACTGGTCVCCYAC             1308-1327
76        ACCTTVGCCCAGTTSCCCRCCATGGA       1453-1428
77        AACCCACTCTATGYCCGGYCAT           205-226
78        GAATCGCTGGGGTGACCG               171-188
79        CCATGAATCACTCCCCTGTGAGGAACTA     30-57
80        TTGCGGGGGCACGCCCAA               244-227

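
The letter code described above can be made concrete
with a short Python sketch that expands a degenerate
sequence into the explicit sequences it stands for. The
helper names are hypothetical; the code table follows
the standard nucleotide ambiguity codes (with M for A or
C), U is treated as T, and the "unknown or other"
reading of N is ignored. Sequence No. 68 from Table 1 is
used as the example because it contains Y, K, and R.

```python
from itertools import product

# Standard nucleotide ambiguity codes, written for DNA (U treated as T).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT", "N": "ACGT",
}


def degeneracy(seq: str) -> int:
    """Number of explicit sequences represented by a degenerate sequence."""
    count = 1
    for base in seq.upper():
        count *= len(IUPAC[base])
    return count


def expand(seq: str) -> list:
    """List every explicit sequence represented by a degenerate sequence."""
    choices = (IUPAC[base] for base in seq.upper())
    return ["".join(combo) for combo in product(*choices)]


def span_length(first: int, last: int) -> int:
    """Length implied by a position range, in either orientation."""
    return abs(last - first) + 1


SEQ_68 = "ACAGAYCCGCAKAGRTCCCCCACG"   # Sequence No. 68, positions 1192-1169
variants = expand(SEQ_68)
```

Comparing `len(SEQ_68)` with `span_length(1192, 1169)`
is a quick consistency check between a sequence and its
listed position range.
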
For cDNA synthesis and PCR amplification, a
protocol developed by Perkin-Elmer/Cetus (GeneAmp®
RNA PCR kit) was used. Both random hexamers and primers
with specific complementary sequences to HCV were
employed to prime the reverse transcription (RT)
reaction. All processes, except for adding and mixing
reaction components, were performed in a thermal cycler
(MJ Research, Inc.). The first strand cDNA synthesis
reaction was inactivated at 99°C for 5 min, and then
cooled at 50°C for 5 min before adding reaction
components for subsequent amplification. After an
initial 5 cycles of 97°C for 1 min, 50°C for 2 min, and
72°C for 3 min, 30 cycles of 94°C for 1 min, 55°C for 2
min, and 72°C for 3 min followed, and then a final 7
min of elongation at 72°C.
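
The amplification program above can be written down as
data, which makes the total programmed hold time easy to
check. The class and variable names below are
hypothetical, and the figure computed covers hold times
only; ramping between temperatures, which depends on the
instrument, is not included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float    # hold temperature in degrees Celsius
    minutes: float   # hold time in minutes


@dataclass(frozen=True)
class Stage:
    cycles: int
    steps: tuple

    def minutes(self) -> float:
        """Total hold time for all cycles of this stage."""
        return self.cycles * sum(step.minutes for step in self.steps)


# Amplification program as described in the text.
PROGRAM = (
    Stage(5, (Step(97, 1), Step(50, 2), Step(72, 3))),
    Stage(30, (Step(94, 1), Step(55, 2), Step(72, 3))),
    Stage(1, (Step(72, 7),)),   # final elongation
)

total_cycles = sum(stage.cycles for stage in PROGRAM[:2])
total_hold_minutes = sum(stage.minutes() for stage in PROGRAM)
```

Under these assumptions the program comprises 35
amplification cycles and 217 minutes of programmed hold
time (30 + 180 + 7).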

For the genotyping analysis, sequences 67 and 68
were used as primers in the PCR reaction. These
primers amplify a segment corresponding to the core and
envelope regions. After amplification, the reaction
products were separated on an agarose gel and then
transferred to a nylon membrane. The immobilized
reaction products were allowed to hybridize with a
32P-labelled nucleic acid corresponding to either
Genotype I (core or envelope 1) or Genotype II (core or
envelope 1). Nucleic acid corresponding to Genotype I
comprised sequences numbered 69 (core), 71 (envelope),
and 73 (envelope). Nucleic acid corresponding to
Genotype II comprised sequences numbered 70 (core), 72
(envelope), and 74 (envelope).

The Genotype I probes only hybridized to the
product amplified from isolates which had Genotype I
sequence. Similarly, Genotype II probes only
hybridized to the product amplified from isolates which
had Genotype II sequence.

In another experiment, PCR products were generated
using sequences 79 and 80. The products were analyzed
as described above except Sequence No. 73 was used to
detect Genotype I, Sequence No. 74 was used to detect
Genotype II, Sequence No. 77 (5'UT) was used to detect
Genotype III, and Sequence No. 78 (5'UT) was used to
detect Genotype IV. Each sequence hybridized in a
genotype specific manner.
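
The probe-to-genotype assignments in the second
experiment lend themselves to a simple lookup. The
sketch below is a hypothetical illustration of how
hybridization readings might be turned into a genotype
call; the numeric signal values and the threshold are
invented for the example and do not come from the
experiments described.

```python
# Probe sequence numbers and the genotype each was used to detect.
PROBE_GENOTYPE = {73: "I", 74: "II", 77: "III", 78: "IV"}


def call_genotypes(signals: dict, threshold: float) -> set:
    """Return the genotypes whose probe signal meets the threshold.

    `signals` maps a probe sequence number to a measured signal in
    arbitrary units; probes not listed in PROBE_GENOTYPE are ignored.
    """
    return {
        PROBE_GENOTYPE[probe]
        for probe, signal in signals.items()
        if probe in PROBE_GENOTYPE and signal >= threshold
    }


# Invented readings for one isolate (arbitrary units).
example_signals = {73: 0.1, 74: 5.0, 77: 0.2, 78: 0.0}
example_call = call_genotypes(example_signals, threshold=1.0)
```

An empty result or a result with more than one genotype
would indicate a reading that needs further review.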

III. Detection of HCV GI-GIV using a sandwich
hybridization assay for HCV RNA

An amplified solution phase nucleic acid sandwich
hybridization assay format is described in this
example. The assay format employs several nucleic acid
probes to effect capture and detection. A capture
probe nucleic acid is capable of associating a
complementary probe bound to a solid support and HCV
nucleic acid to effect capture. A detection probe
nucleic acid has a first segment (A) that binds to HCV
nucleic acid and a second segment (B) that hybridizes
to a second amplifier nucleic acid.
The amplifier nucleic acid has a first segment (B*)
that hybridizes to segment (B) of the probe nucleic
acid and also comprises fifteen iterations of a segment
(C). Segment (C) of the amplifier nucleic acid is
capable of hybridizing to three labeled nucleic acids.
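
The signal multiplication implied by this arrangement
can be worked out with simple arithmetic. The sketch
below is an upper-bound calculation only, assuming every
binding site is occupied; the function names are
hypothetical, and the count of twenty detection probes in
the example is chosen for illustration.

```python
SEGMENT_C_ITERATIONS = 15   # iterations of segment (C) per amplifier
LABELS_PER_SEGMENT_C = 3    # labeled nucleic acids bound per segment (C)


def labels_per_amplifier() -> int:
    """Maximum number of labeled nucleic acids bound to one amplifier."""
    return SEGMENT_C_ITERATIONS * LABELS_PER_SEGMENT_C


def max_labels_per_target(detection_probes: int) -> int:
    """Upper bound on labels per captured target, assuming each
    detection probe binds one amplifier and all sites are filled."""
    return detection_probes * labels_per_amplifier()


example_max = max_labels_per_target(20)
```

Each amplifier can therefore carry at most 45 labels,
and twenty bound detection probes would give an upper
bound of 900 labels per captured target.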

Nucleic acid sequences which correspond to
nucleotide sequences of the envelope 1 gene of Group I
HCV isolates are set forth in sequences numbered
81-99. Table 2 sets forth the area of the HCV genome
to which the nucleic acid sequences correspond and a
preferred use of the sequences.



Table 2

Probe Type   Sequence No.   Complement of
                            Nucleotide Numbers

Label        81             879-911
Label        82             912-944
Capture      83             945-977
Label        84             978-1010
Label        85             1011-1043
Label        86             1044-1076
Label        87             1077-1109
Capture      88             1110-1142
Label        89             1143-1175
             90             1176-1208
Label        91             1209-1241
Label        92             1242-1274
Capture      93             1275-1307
Label        94             1308-1340
Label        95             1341-1373
Label        96             1374-1406
Label        97             1407-1439
Capture      98             1440-1472
Label        99             1473-1505

Nucleic acid sequences which correspond to
nucleotide sequences of the envelope 1 gene of Group II
HCV isolates are set forth in sequences 100-118. Table
3 sets forth the area of the HCV genome to which the
nucleic acid corresponds and the preferred use of the
sequences.

Table 3

Probe Type   Sequence No.   Complement of
                            Nucleotide Numbers

             100            879-911
Label        101            912-944
Capture      102            945-977
Label        103            978-1010
Label        104            1011-1043
Label        105            1044-1076
Label        106            1077-1109
Capture      107            1110-1142
Label        108            1143-1175
Label        109            1176-1208
Label        110            1209-1241
Label        111            1242-1274
Capture      112            1275-1307
Label        113            1308-1340
Label        114            1341-1373
Label        115            1374-1406
Label        116            1407-1439
Capture      117            1440-1472
Label        118            1473-1505
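
The nucleotide ranges listed for the envelope 1 probes
form a regular tiling: nineteen contiguous windows of 33
nucleotides covering positions 879 through 1505. A
minimal Python sketch of that tiling follows; the
function and variable names are hypothetical, and the
pairing of windows with sequence numbers simply follows
the order of the tables.

```python
def tile(first: int, last: int, width: int) -> list:
    """Contiguous, non-overlapping (start, end) windows of fixed width.

    Positions are inclusive; a trailing partial window is dropped.
    """
    return [
        (start, start + width - 1)
        for start in range(first, last + 1, width)
        if start + width - 1 <= last
    ]


windows = tile(879, 1505, 33)

# Sequence numbers paired with windows in table order.
group_i = dict(zip(range(81, 100), windows))     # Group I, sequences 81-99
group_ii = dict(zip(range(100, 119), windows))   # Group II, sequences 100-118
```

Because 19 x 33 = 627 = 1505 - 879 + 1, the windows
cover the region exactly, with no gaps and no overlap.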

Nucleic acid sequences which correspond to
nucleotide sequences in the C gene and the 5'UT region
are set forth in sequences 119-145. Table 4 identifies
the sequence with a preferred use.

Table 4

Probe Type   Sequence No.

Capture      119
Label        120
Label        121
Label        122
Capture      123
Label        124
Label        125
Label        126
Capture      127
Label        128
Label        129
Label        130
Capture      131
Label        132
Label        133
Label        134
Label        135
Capture      136
Label        137
Label        138
Label        139
Capture      140
Label        141
Label        142
Label        143
Capture      144
Label        145



The detection and capture probe HCV-specific
segments, and their respective names as used in this
assay were as follows.

Capture sequences are sequences numbered
119-122 and 141-144.

Detection sequences are sequences numbered
119-140.

Each detection sequence contained, in addition to
the sequences substantially complementary to the HCV
sequences, a 5' extension (B) which extension (B) is
complementary to a segment of the second amplifier
nucleic acid. The extension (B) sequence is identified
in the Sequence Listing as Sequence No. 146, and is
reproduced below.

AGGCATAGGACCCGTGTCTT

Each capture sequence contained, in addition to
the sequences substantially complementary to HCV
sequences, a sequence complementary to DNA bound to a
solid phase. The sequence complementary to DNA bound
to a solid support was carried downstream from the
capture sequence. The sequence complementary to the
DNA bound to the support is set forth as Sequence No.
147 and is reproduced below.

CTTCTTTGGAGAAAGTGGTG

Microtiter plates were prepared as follows. White
Microlite 1 Removawell strips (polystyrene microtiter
plates, 96 wells/plate) were purchased from Dynatech
Inc.

Each well was filled with 200 µl 1 N HCl and
incubated at room temperature for 15-20 min. The
plates were then washed 4 times with 1X PBS and the
wells aspirated to remove liquid. The wells were then
filled with 200 µl 1 N NaOH and incubated at room
temperature for 15-20 min. The plates were again
washed 4 times with 1X PBS and the wells aspirated to
remove liquid.

Poly(phe-lys) was purchased from Sigma Chemicals,
Inc. This polypeptide has a 1:1 molar ratio of phe:lys
and an average m.w. of 47,900 gm/mole. It has an
average length of 309 amino acids and contains 155
amines/mole. A 1 mg/ml solution of the polypeptide was
mixed with 2M NaCl/1X PBS to a final concentration of
0.1 mg/ml (pH 6.0). A volume of 200 µl of this
solution was added to each well. The plate was wrapped
in plastic to prevent drying and incubated at 30°C
overnight. The plate was then washed 4 times with 1X
PBS and the wells aspirated to remove liquid.
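
The quantities in the coating step can be checked with a
short calculation. The sketch below estimates how much
polypeptide, and how many amine groups, are offered to
each well. It assumes that the stated figure of 155
amines refers to one polypeptide molecule of average
molecular weight, and it says nothing about how much of
the polypeptide actually binds to the plastic. Variable
names are hypothetical.

```python
STOCK_MG_PER_ML = 1.0        # stock polypeptide solution
FINAL_MG_PER_ML = 0.1        # concentration after mixing with NaCl/PBS
WELL_VOLUME_UL = 200.0       # volume added to each well
MOL_WEIGHT_G_PER_MOL = 47900.0
AMINES_PER_MOLECULE = 155    # assumption: figure is per polypeptide molecule

dilution_factor = STOCK_MG_PER_ML / FINAL_MG_PER_ML

# mg/ml multiplied by microliters gives micrograms.
mass_per_well_ug = FINAL_MG_PER_ML * WELL_VOLUME_UL

# micrograms divided by g/mol gives micromoles; convert to nanomoles.
polypeptide_nmol = mass_per_well_ug / MOL_WEIGHT_G_PER_MOL * 1000.0
amine_nmol = polypeptide_nmol * AMINES_PER_MOLECULE
```

Under these assumptions each well receives about 20 µg
of polypeptide, roughly 0.42 nmol, carrying on the order
of 65 nmol of amine groups.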

The following procedure was used to couple the
nucleic acid, a complementary sequence to Sequence No.
147, to the plates, hereinafter referred to as
immobilized nucleic acid. Synthesis of immobilized
nucleic acid having a sequence complementary to
Sequence No. 133 was described in EPA 883096976. A
quantity of 20 mg disuccinimidyl suberate (DSS) was
dissolved in 300 µl dimethyl formamide (DMF). A
quantity of 26 OD260 units of immobilized nucleic acid
was added to 100 µl coupling buffer (50 mM sodium
phosphate, pH 7.8). The coupling mixture was then added
to the DSS-DMF solution and stirred with a magnetic
stirrer for 30 min. A NAP-25 column was equilibrated
with 10 mM sodium phosphate, pH 6.5. The coupling
mixture DSS-DMF solution was added to 2 ml 10 mM sodium
phosphate, pH 6.5, at 4°C. The mixture was vortexed to
mix and loaded onto the equilibrated NAP-25 column.
DSS-activated immobilized nucleic acid DNA was eluted
from the column with 3.5 ml 10 mM sodium phosphate, pH
6.5. A quantity of 5.6 OD260 units of eluted
DSS-activated immobilized nucleic acid DNA was added to
1500 ml 50 mM sodium phosphate, pH 7.8. A volume of 50
µl of this solution was added to each well and the
plates were incubated overnight. The plate was then
washed 4 times with 1X PBS and the wells aspirated to
remove liquid.
Final stripping of plates was accomplished as
follows. A volume of 200 µl of 0.2N NaOH containing
0.5% (w/v) SDS was added to each well. The plate was
wrapped in plastic and incubated at 65°C for 60 min.
The plate was then washed 4 times with 1X PBS and the
wells aspirated to remove liquid. The stripped plate
was stored with desiccant beads at 2-8°C.

Serum samples to be assayed were analyzed using
PCR followed by sequence analysis to determine the
genotype.

Sample preparation consisted of delivering 50 µl
of the serum sample and 150 µl P-K Buffer (2 mg/ml
proteinase K in 53 mM Tris-HCl, pH 8.0/0.6 M NaCl/0.06
M sodium citrate/8 mM EDTA, pH 8.0/1.3% SDS/16 µg/ml
sonicated salmon sperm DNA/7% formamide/50 fmoles
capture probes/160 fmoles detection probes) to each
well. Plates were agitated to mix the contents in the
well, covered and incubated for 16 hr at 62°C.
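
Because 150 µl of P-K Buffer is combined with 50 µl of
serum, every buffer component is diluted to three
quarters of its stated concentration in the well. The
sketch below performs that arithmetic. It is a simple
dilution calculation that ignores anything contributed by
the serum itself; the dictionary keys and function name
are hypothetical, and the units are those read from the
passage above.

```python
SERUM_UL = 50.0
BUFFER_UL = 150.0

# P-K Buffer components as stated, with units in the key.
PK_BUFFER = {
    "proteinase K (mg/ml)": 2.0,
    "Tris-HCl (mM)": 53.0,
    "NaCl (M)": 0.6,
    "sodium citrate (M)": 0.06,
    "EDTA (mM)": 8.0,
    "SDS (%)": 1.3,
    "salmon sperm DNA (ug/ml)": 16.0,
    "formamide (%)": 7.0,
}


def final_concentrations(stock: dict, stock_ul: float, sample_ul: float) -> dict:
    """Concentration of each stock component after adding the sample."""
    factor = stock_ul / (stock_ul + sample_ul)
    return {name: value * factor for name, value in stock.items()}


in_well = final_concentrations(PK_BUFFER, BUFFER_UL, SERUM_UL)
```

With these inputs the well contains approximately 0.45 M
NaCl, 0.045 M sodium citrate, 40 mM Tris-HCl, 6 mM EDTA,
and about 1% SDS during the 16 hr incubation.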

After a further 10 minute period at room
temperature, the contents of each well were aspirated
to remove all fluid, and the wells washed 2X with
washing buffer (0.1% SDS/0.015 M NaCl/0.0015 M sodium
citrate). The amplifier nucleic acid was then added to
each well (50 µl of 0.7 fmole/µl solution in 0.48
M NaCl/0.048 M sodium citrate/0.1% SDS/0.5% "blocking
reagent" (Boehringer Mannheim, catalog No. 1096 176)).
After covering the plates and agitating to mix the
contents in the wells, the plates were incubated for 30
min. at 52°C.

After a further 10 min period at room temperature,
the wells were washed as described above.

Alkaline phosphatase label nucleic acid, disclosed
in EP 883096976, was then added to each well (50
µl/well of 2.66 fmoles/µl). After incubation at
52°C for 15 min., and 10 min. at room temperature, the
wells were washed twice as above and then 3X with 0.015
M NaCl/0.0015 M sodium citrate.

An enzyme-triggered dioxetane (Schaap et al., Tet.
Lett. (1987) 28:1159-1162 and EPA Pub. No. 0254051),
obtained from Lumigen, Inc., was employed. A quantity
of 50 µl Lumiphos 530 (Lumigen) was added to each
well. The wells were tapped lightly so that the
reagent would fall to the bottom and gently swirled to
distribute the reagent evenly over the bottom. The
wells were covered and incubated at 37°C for 20-40 min.

Plates were then read on a Dynatech ML 1000 
luminometer. Output was given as the full integral of 
the light produced during the reaction. 

The assay positively detected each of the serum
samples, regardless of genotype.

IV. Expression of the Polypeptide Encoded in Sequences
Defined by Differing Genotypes

HCV polypeptides encoded by a sequence within
sequences 1-66 are expressed as a fusion polypeptide
with superoxide dismutase (SOD). A cDNA carrying such
sequences is subcloned into the expression vector
pSODcf1 (Steimer et al. (1986)).

First, DNA isolated from pSODcf1 is treated with
BamHI and EcoRI, and the following linker was ligated
into the linear DNA created by the restriction enzymes:

5' GAT CCT GGA ATT CTG ATA AGA
   CCT TAA GAC TAT TTT AA 3'

After cloning, the plasmid containing the insert is
isolated.

Plasmid containing the insert is restricted with
EcoRI. The HCV cDNA is ligated into this EcoRI
linearized plasmid DNA. The DNA mixture is used to
transform E. coli strain D1210 (Sadler et al. (1980)).
Polypeptides are isolated on gels.

V. Antigenicity of Polypeptides

The antigenicity of polypeptides formed in Section
IV is evaluated in the following manner. Polyethylene
pins arranged on a block in an 8 x 12 array (Coselco
Mimetopes, Victoria, Australia) are prepared by placing
the pins in a bath (20% v/v piperidine in
dimethylformamide (DMF)) for 30 minutes at room
temperature. The pins are removed, washed in DMF for 5
minutes, then washed in methanol four times (2
min/wash). The pins are allowed to air dry for at
least 10 minutes, then washed a final time in DMF
(5 min). 1-Hydroxybenzotriazole (HOBt, 367 mg) is
dissolved in DMF (80 µL) for use in coupling
Fmoc-protected polypeptides prepared in Section IV.

The protected amino acids are placed in
micro-titer plate wells with HOBt, and the pin block is
placed over the plate, immersing the pins in the
wells. The assembly is then sealed in a plastic bag
and allowed to react at 25°C for 18 hours to couple the
first amino acids to the pins. The block is then
removed, and the pins washed with DMF (2 min.), MeOH
(4 x, 2 min.), and again with DMF (2 min.) to clean and
deprotect the bound amino acids. The procedure is
repeated for each additional amino acid coupled, until
all octamers are prepared.
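
The octamer preparation described above can be pictured as tiling a polypeptide with 8-residue windows and assigning each window to one pin of the 8 x 12 block. The sketch below illustrates this under stated assumptions: the one-residue step between windows, the row/column well labels, and the 12-residue placeholder sequence are all hypothetical and are not taken from the text.

```python
from itertools import product

ROWS = "ABCDEFGH"    # 8 rows of the pin block
COLS = range(1, 13)  # 12 columns of the pin block


def overlapping_peptides(sequence: str, length: int = 8, step: int = 1) -> list[str]:
    """Return every window of `length` residues, advancing `step` residues each time."""
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    return [sequence[i:i + length]
            for i in range(0, len(sequence) - length + 1, step)]


def assign_to_pins(peptides: list[str]) -> dict[str, str]:
    """Map peptides to pin positions A1..H12 in row-major order."""
    wells = [f"{row}{col}" for row, col in product(ROWS, COLS)]
    if len(peptides) > len(wells):
        raise ValueError("more peptides than pins on one 8 x 12 block")
    return dict(zip(wells, peptides))


# Placeholder 12-residue sequence, for illustration only.
octamers = overlapping_peptides("ACDEFGHIKLMN")
layout = assign_to_pins(octamers)
print(layout)
```

A polypeptide of n residues yields n - 7 octamers at a one-residue step, so a single 96-pin block would cover a stretch of up to 103 residues under these assumptions.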

The free N-termini are then acetylated to
compensate for the free amide, as most of the epitopes
are not found at the N-terminus and thus would not have
the associated positive charge. Acetylation is
accomplished by filling the wells of a microtiter plate
with DMF/acetic anhydride/triethylamine (5:2:1 v/v/v)
and allowing the pins to react in the wells for 90
minutes at 20°C. The pins are then washed with DMF (2
min.) and MeOH (4 x, 2 min.), and air dried for at
least 10 minutes.

The side chain protecting groups are removed by
treating the pins with trifluoroacetic acid/phenol/
dithioethane (95:2.5:1.5, v/v/v) in polypropylene bags
for 4 hours at room temperature. The pins are then
washed in dichloromethane (2 x, 2 min.), 5%
di-isopropylethylamine/dichloromethane (2 x, 5 min.),
dichloromethane (5 min.), and air-dried for at least 10
minutes. The pins are then washed in water (2 min.),
MeOH (18 hours), dried in vacuo, and stored in sealed
plastic bags over silica gel.

IV.B.15.b Assay of Peptides.

Octamer-bearing pins are treated by sonicating for
30 minutes in a disruption buffer (1% sodium
dodecylsulfate, 0.1% 2-mercaptoethanol, 0.1 M NaH2PO4)
at 60°C. The pins are then immersed several times in
water (60°C), followed by boiling MeOH (2 min.), and
allowed to air dry.

The pins are then precoated for 1 hour at 25°C in
microtiter wells containing 200 µL blocking buffer
(1% ovalbumin, 1% BSA, 0.1% Tween, and 0.05% NaN3 in
PBS), with agitation. The pins are then immersed in
microtiter wells containing 175 µL antisera obtained
from human patients diagnosed as having HCV and allowed
to incubate at 4°C overnight. The formation of a
complex between polyclonal antibodies of the serum and
the polypeptide indicates that the peptides give rise
to an immune response in vivo. Such peptides are
candidates for the development of vaccines.

Thus, this invention has been described and
illustrated. It will be apparent to those skilled in
the art that many variations and modifications can be
made without departing from the purview of the appended
claims and without departing from the teaching and
scope of the present invention.

SEQUENCE LISTING
(1) GENERAL INFORMATION:

(i) APPLICANT: Tai-An Cha 

(ii) TITLE OF INVENTION: HCV GENOMIC SEQUENCES 
FOR DIAGNOSTICS AND THERAPEUTICS 

(iii) NUMBER OF SEQUENCES: 147 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Wolf, Greenfield & Sacks, 

(B) STREET: 600 Atlantic Avenue 

(C) CITY: Boston 

(D) STATE: Massachusetts 

(E) COUNTRY: USA 
(F) ZIP: 02210

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Diskette, 5.25 inch

(B) COMPUTER: IBM compatible

(C) OPERATING SYSTEM: MS-DOS Version 3.3

(D) SOFTWARE: WordPerfect 5.1

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER: Not Available 

(B) FILING DATE: Not Available 

(C) CLASSIFICATION: Not Available 


(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 07/697,326 

(B) FILING DATE: 8 May 1991 

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Janiuk, Anthony J.

(B) REGISTRATION NUMBER: 29,809

(C) REFERENCE/DOCKET NUMBER: C0772/7000

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: (617) 720-3500

(B) TELEFAX: (617) 720-2441

(C) TELEX: EZEKIEL

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 340 nucleotides 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: (ATCC # 40394)

(C) INDIVIDUAL ISOLATE: ns5hcv1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1 
CTCCACAGTC ACTGAGAGCG ACATCCGTAC GGAGGAGGCA 40
ATCTACCAAT GTTGTGACCT CGACCCCCAA GCCCGCGTGG 80
CCATCAAGTC CCTCACCGAG AGGCTTTATG TTGGGGGCCC 120
TCTTACCAAT TCAAGGGGGG AGAACTGCGG CTATCGCAGG 160
TGCCGCGCGA GCGGCGTACT GACAACTAGC TGTGGTAACA 200
CCCTCACTTG CTACATCAAG GCCCGGGCAG CCTGTCGAGC 240
CGCAGGGCTC CAGGACTGCA CCATGCTCGT GTGTGGCGAC 280
GACTTAGTCG TTATCTGTGA AAGCGCGGGG GTCCAGGAGG 320
ACGCGGCGAG CCTGAGAGCC 340
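
Each entry in this listing declares a LENGTH and then prints the sequence in groups of ten with running position counters, so a simple consistency check is possible. The sketch below is one assumed way to do that: it joins the groups, drops purely numeric tokens (counters and stray margin numbers), compares the result with the declared length, and flags characters outside A, C, G, T, which in scanned text usually indicate OCR damage. The function names and the toy input are hypothetical.

```python
def parse_sequence_block(lines: list[str]) -> str:
    """Join grouped sequence lines, dropping purely numeric tokens such as counters."""
    residues = []
    for line in lines:
        for token in line.split():
            if token.isdigit():
                continue  # position counter or margin line number
            residues.append(token.upper())
    return "".join(residues)


def invalid_positions(sequence: str, alphabet: str = "ACGT") -> list[int]:
    """Return 0-based positions of characters that are not valid nucleotides."""
    return [i for i, ch in enumerate(sequence) if ch not in alphabet]


# Toy input (not taken from the listing): 30 nucleotides, one OCR-style error.
toy_lines = [
    "ACGTACGTAC GGTTAACCGG 20",
    "TTGACC6ATA 30",
]
toy_sequence = parse_sequence_block(toy_lines)
print(len(toy_sequence), invalid_positions(toy_sequence))
```

A sequence whose parsed length differs from its declared LENGTH, or that reports invalid positions, would be a candidate for re-checking against the original filing rather than for automatic correction.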

(2) INFORMATION FOR SEQ ID NO: 2: 






(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 340 nucleotides 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: ns5i21

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2

5 CTCCACAGTC ACT6A6A6C6 ACATCC6TAC 66A06A6GCA 40 

ATTTACCAAT GTTGT6ACCT G6ACCCCCAA GCCCGCATGG 80 

CCATCAA6TC CCTCACTGA6 AGGCTTTAT6 TCGQGG6CCC 120 

TCTTACCAAT TCAAG6GG0G AGAACTGC6G CTACC6CAGG 160 

TGCCGCGCGA GCGGCGTACT GACAACTAGC TGTGGTAACA 200 

10 CCCTCACTT6 CTACATCAAO 6CCC0G6CAG CCTGTCGAGC 240 

CGCA666CTC CAG6ACTGCA CCAT6CTTGT GTGTG6CGAC 280 

GACTTAGTCG TTATCTGTGA AA6TGC6G6G GTCCAG6AGG 320 

ACGCGGCGAG CCTGA6AGCC 340 

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 340 nucleotides

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: ns5pt1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3

CTCCACAGTC ACTGAGA6CG ACATCCGTAC GGAGGAGGCA 40 

ATCTACCAAT 6TTGTGATCT GGACCCCCAA GCCC6C6TGG 80 

CCATCAAGTC CCTCACTGA6 AG6CTTTACG TTGGGGGCCC 120 

5 TCTTACCAAT TCAAGGGGGG AGAACTGC6G CTACCGCAGG 160 

TGCCGGGCGA GCGGCGTACT GACAACTAGC TGTGGTAATA 200 

CCCTCACTTG CTACATCAAG GCCCGGGCAG CCTGTCGA6C 240 

C6CA666CTC C6GGACTGCA CCATGCTCGT GTGTGGTGAC 280 

GACTTGGTCG TTATCT6IGA GAGTGCGGGG GTCCAGGAGG 320 

10 ACGCGGCGAG CCTGAGA6CC 340 

(2) INFORMATION FOR SEQ ID NO: 4

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 340 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: ns5gin2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4

CTCTACAGTC ACTGAGAACG ACATCCGTAC GGAGGAGGCA 40 
ATTTACCAAT GTT6TGACCT GGACCCCCAA GCCCGCGTGG 80 
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CCATCAAGTC CCTCACTGA6 AGGCTTTAT6 TTGG6GGCCC 120 

CCTTACCAAT TCAA666G6G AAAACT6C66 CTATCGCAGG 160 

TGCCOCGCGA 6C66CGTACT 6ACAACTAGC TGTGGTAACA 200 

CCCTCACTTG CTACATTAAG GCCCGGGCAG CCTGTCGAGC 240 

5 CGCAGGGCTC CAGGACTGCA CCATGCTCGT GTGTG6CGAC 280 

GACTTAGTCG TTATCTGTGA 6AGTGC6G6A 6TCCAGGA6G 320 

ACGC6GCGAA CTTGAGAGCC 340 

(2) INFORMATION FOR SEQ ID NO: 5 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 340 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: ns5us17

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5 

CTCCACAGTC ACTGAGAGCG ATATCCGTAC GGAGGAGGCA 40 

ATCTACCAGT GTTGT6ACCT GGACCCCCAA GCCCGCGTGG 80 

25 CCATCAAGTC CCTCACCGAG AGGCTTTATG TCGG6GGCCC 120 

TCTTACCAAT TCAAGG6GGG AAAACTGCGG CTATCGCAGG 160 

T6CCGCGCAA GCGGCGTACT GACAACTAGC TGTGGTAACA 200 
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CCCTCACTT6 TTACATCAAG GCCCAAGCAG CCTGTC6AGC 240 

CGCAGGGCTC CGGGACTGCA CCATGCTCGT GTGTGGCGAC 280 

6ACTIAGTCG TTATCTGTGA AAGTCAGG6A GTCCAGGA6G 320 

AT6CA6CGAA CCI6A6A6CC 340 

(2) INFORMATION FOR SEQ ID NO: 6

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 340 nucleotides
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: ns5sp2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6

20 CTCTACAGTC ACT6AGAGCG ATATCCGTAC GGA6GAG6CA 40 

ATCTACCAAT 6TTGTGACCT G6ACCCC6AA GCCCGT6T6G 80 

CCATCAAGTC CCTCACT6A6 AG6CTTTAT6 TTGGGGGCCC 120 

TCTTACCAAT TCAAGG6G66 A6AACTGCG6 CTACCGCA6G 160 

TGCCGCGCAA GCGGCGTACT 6ACGACTAGC TGTG6TAATA 200 

25 CCCTCACTTG TTACATCAAG GCCCGGGCAG CCTGTCGAGC 240 

CGCAGGGCTC CA6GACTGCA CCATGCTCGT GTGTGGCGAC 280 



GACCTAGTCG TTATCTGCGA AAGTGCGGGG GTCCAGGAGG 320
ACGCGGCGAG CCTGAGAGCC 340

(2) INFORMATION FOR SEQ ID NO: 7

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 340 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: ns5j1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7

CTCCACAGTC ACT6AGAATG ACACCCGTGT TGAGGAGTCA 40 

ATTTACCAAT 6TT6TGACTT GGCCCCCGAA 6CCAGACAGG 80 

20 CCATAAGGTC 6CTCACA6AG C6GCTCTAT6 TC6GG66TCC 120 

TAT6ACTAAC TCCAAA66GC A6AACTGCGG CTATCGCC6G 160 

T6CCGCGC6A GC6GC6TGCT GACGACTAGC TGC66TAATA 200 

CCCTCACATG CTACCTGAA6 GCCACAGC6G CCTGTCGA6C 240 

TGCCAAGCTC CAG6ACTGCA CGAT6CTCGT GAAC6GAGAC 280 

25 . GACCTTGTC6 TTATCTGT6A AAGCGC666G AACCAAGAGG 320 

ACGCGGCAAG CCTACGAGCC 340

(2) INFORMATION FOR SEQ ID NO: 8

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 340 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: ns5U

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8

CTCAACGGTC ACTGAGAATG ACATCCGTGT TGAGGAGTCA 40
ATTTACCAAA GTTGTGACTT GGCCCCCGAG GCCAGACAAG 80
CCATAAGGTC GCTCACAGAG CGGCTTTACA TCGGGGGCCC 120
CCTGACTAAT TCAAAAGGGC AGAACTGCGG CTATCGCCGA 160
TGCCGCGCCA GCGGTGTGCT GACGACTAGC TGCGGTAATA 200
CCCTCACATG TTACTTGAAG GCCACTGCGG CCTGTAGAGC 240
TGCGAAGCTC CAGGACTGCA CGATGCTCGT GTGCGGAGAC 280
GACCTTGTCG TTATCTGTGA AAGCGCGGGA ACCCAGGAGG 320
ATGCGGCGAG CCTACGAGTC 340

(2) INFORMATION FOR SEQ ID NO: 9

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 340 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: ns5k1.1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9

CTCAACGGTC ACCGA6AATG ACATCC6T6T TGA6GAGTCA 40 

ATTTATCAAT 6TT6TGCCTT G6CCCCCGAG GCTAGACAGG 80 

15 CCATAA66TC 6CTCACA6A6 CG6CTTTATA TC66666CCC 120 

CCTGACCAAT TCAAA666GC AGAACT6CG6 TTATCGCCG6 160 

TGCCGCGCCA GC66CGTACT GACGACCAGC TGCGGTAATA 200 

CCCTTACATG TTACTTGAA6 GCCTCTGCAG CCTGTCGAOC 240 

CGCGAAGCTC CAGGACT6CA CGATGCTCGT GTGTGGGGAC 280 

20 6ACCTTGTCG TTATCTGTGA AA6CGCG66A ACCCAG6AGG 320 

ACGCGGCGAA CCTAC6AGTC 340 

(2) INFORMATION FOR SEQ ID NO: 10 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 340 nucleotides

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: ns5gh6

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10

10 CTCAACG6TC ACT6AGAGTG ACATCCGTGT CGAGGAGTCG 40 

ATTTACCAAT 6TT6T6ACTT 6GCCCCC6AA GCCA6GCA6G 80 

CCATAAG6TC 6CTCACCGAG CGACTTTATA TCGG6GGCCC 120 

CCT6ACTAAT TCAAAAG66C A6AACT6C6G TTATC6CC6G 160 

TGCC6CGCGA 6C6GCGTGCT 6ACGACTAGC TGCG6TAATA 200 

15 CCCTCACAT6 TTACTT6AA6 GCCTCTGCAG CCTGTCGAGC 240 

TGCAAAGCTC CAGGACTGCA C6ATGCTCGT 6AAC6GGGAC 280 

GACCTTGTCG TIATCTGCGA 6A6CGC6GGA ACCCAAGAGG 320 

ACGCGGCGAG CCTACGAGTC 340 

(2) INFORMATION FOR SEQ ID NO: 11

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 340 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: ns5sp1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11

CTCCACAGTC ACTGAGAGTG ACATCCGTGT TGAGGAGTCA 40 

ATTTACCAAT GTT6TGACTT GGCCCCCGAA GCCAGACAGG 80 

CTATAAGGTC GCTCACAGAG CGGCTGTACA TCGGGGGTCC 120 

10 CCTGACTAAT TCAAAAGG6C AGAACTGCGG CTATC6CCGG 160 

T6CCGC6CAA GCGGCGT6CT 6ACGACTAGC T6CGGTAACA 200 

CCCTCACAT6 TTACTTGAAO 6CCTCT6CG6 CCT6TC0AGC 240 

T6CGAA6CTC CAGGACT6CA CGATGCTCGT GTGCGGTGAC 280 

GACCTT6TCG TTATCTGTGA GA6CGC6GGA ACCCAAGAGG 320 

15 ACGCG6CGAG CCTAC6A6TC 340 

(2) INFORMATION FOR SEQ ID NO: 12 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 340 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: ns5sp3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12

CTCaACAGTC ACTGA6AGT6 ACATCCGT6T TGAGGAGTCA 40 

5 ATCTACCAAT GTTGTGACTT GGCCCCCGAA 6CCA0ACAGG 80 

CTATAAG6TC GCTCACA6A6 CGGCTTTACA TC6GGGGTCC 120 

CCT6ACTAAT TCAAAAGG6C AGAACTGCG6 CTATC6CCGG 160 

TGCCGCGCAA GCGGCGTGCT GACGACTAGC TGCGGTAATA 200 

CCCTCACATG TTACCTGAAG 6CCAGT6CGG CCTGTCGAGC 240 

10 TGCGAAGCTC CAG6ACTGCA CAATGCTCGT GTGCGGTGAC 280 

GACCTTGTCG TTATCTGTGA GAGCGCGGGG ACCCAAGAGG 320 

ACGCGGCGAG CCTACGAGTC 340

(2) INFORMATION FOR SEQ ID NO: 13 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 340 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: ns5k2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13 
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CTCAACCGTC ACTGA6AGAG ACATCAGAAC TGAGGAGTCC 40 

ATATACCGAG CCTGCTCCCT 6CCTGAGGAG GCTCACATTG 80 

CCATACACTC GCTGACTGAG AGGCTCTACG TGGGAGGGCC 120 

CATGTTCAAC A6CAA66GCC A6ACCT6CG6 GTACAG6CGT 160 

TGCCGCGCCA GC6GGGTGCT CACCACTAGC ATGGGGAACA 200 

CCATCACATG CTATGTAAAA GCCCTAGCGG CTTGCAAG6C 240 

TGCAGGGATA GTTGCACCCT CAATGCTGGT ATGCGGCGAC 280 

6ACTTAGTTG TCATCTCA6A AAGCCAGGGG ACTGAGGAGG 320 

ACGAGCGGAA CCTGAGAGCT 340

(2) INFORMATION FOR SEQ ID NO: 14

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 340 nucleotides
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: ns5arg8

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14

25 CTCTACAGTC ACGTAAAAGG ACATCACATC CTAGGAGTCC 40 

ATCTACCAGT CCTGTTCACT GCCCGAGGAG GCTCGAACTG 80 

CTATACACTC ACTGACTGAG AGACTATACG TAGGGGGGCC 120 
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CATGACAAAC A6CAAG66CC AATCCT6C60 6TACAGGCGT 
T6CCGC6C6A GCGCAGTGCT CACCACCAGC AT6GGCAACA 
CACTCAC6T6 CTACGTAAAA GCCAG6GCGG CGTGTAACGC 
C6C6G6GATT GTT6CTCCCA CCATGCTGGT GTGCGGTGAC 
5 6ACC3G6TCG TCATCTCAGA GAGTCAAGGG GCTGAGGAGG 

ACGAGCA6AA CCTGAGAGTC 

(2) INFORMATION FOR SEQ ID NO: 15 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 340 nucleotides 

(B) TYPE: nucleic acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



15 



20 



25 



160 
200 
240 
280 
320 
340 



(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: ns5i10

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15 

CTCTACA6TC ACAGAGA6GG ACATCAGAAC CGAGGAGTCC 40 

ATCTATCTGT CCTGCTCACT GCCTGAGGAG GCCCGAACTG 80 

CTATACACTC ACTGACTGAG AGACTGTAC6 TAGGGGGGCC 120 

CATGACAAAC AGCAAGGGGC AATCCTGC6G GTACAG6C6T 160 

TGCCGCGCGA GCGGAGTGCT CACCACCAGC ATGGGCAACA 200 

CGCTCACGTG CTACGTGAAA GCCAGAGCGG CGTGTAACGC 240 
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CGC6GGCATT GTT6CTCCCA CCATGTT6GT 6T6CG6CGAC 280 
GACCTGGTTG TCATCTCAGA GAGTCA6GG6 GTC6AG6AA6 320 
ATGAGCG6AA CCTGAGAGTC 340 

(2) INFORMATION FOR SEQ ID NO: 16

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 340 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: ns5arg6

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16

CTCTACAGTC ACGGAGAGGG ACATCAGAAC CGAGGAGTCC 40 

20 ATCTATCTGT CCTGTTCACT GCCTGAGGAG GCTCGAACTG 80 

CCATACACTC ACTGACTGAG A6GCTGTACG TAGGGGGGCC 120 

. CATGACAAAC AGCAAAGGGC AATCCTGCGG GTACAGGCGT 160 

TGCC6C6CGA 6CGGAGTGCT CACCACCAGC ATGGGTAACA 200 

CACTCAC6T6 CTACGT6AAA GCTAAAGCGG CATGTAACGC 240 

25 C6CGGGCATT GTTGCCCCCA CCATGTTGGT GTGCGGCGAC 280 

GACCTAGTCG TCATCTCAGA 0AQTCAAG6G GTCGAGGAGG 320 

ATGAGCGAAA CCTGAGAGCT 340

(2) INFORMATION FOR SEQ ID NO: 17

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 340 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: ns5k2b

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17

CTCAACCGTC ACGGAGAGGG ACATAAGAAC AGAAGAATCC 40 

ATATATCAG6 GTTGTTCCCT GCCTCAGGAG GCTA6AACTG 80 

CTATCCACTC GCTCACTGAG A6ACTCTACG TAGGAGGGCC 120 

CATGACAAAC AGCAA6GGAC AATCCTGCG6 TTACAGGCGT 160 

TGCCGCGCCA 6CGG6GTCTT CACCACCAGC ATGGGGAATA 200 

CCAT6ACATG CTACATCAAA GCCCTTGCAG CGT6CAAA6C 240 

TGCAG6GATC GT66ACCCTA TCATGCTGGT GTGTGGAGAC 280 

iSACCTGGTCG TCATCTCGGA GA6CGAA6GT AACGAG6AG6 320 

AC6AGC6AAA CCT6AGAGCT 340 

(2) INFORMATION FOR SEQ ID NO: 18

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 340 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: ns5sa283

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18

CTCGACCGTT ACCGAACATG ACATAAT6AC TGAAGAGTCT 40 

ATTTACCAAT CATTGTACTT GCAGCCTGAG GCGCGTGTGG 80

CAATACGGTC ACTCACCCAA CGCCTGTACT GT6GA66CCC 120 

15 CATGTATAAC A6CAAGGGGC AACAATGT6G TTATCGTAGA 160 

TGCCGCGCCA GCGGCGTCTT CACCACTAGT ATGGGCAACA 200 

CCATGACGTG CTACATTAAG GCTTTAGCCT CCTGTAGAGC 240 

C6CAAAGCTC CAG6ACTGCA CGCTCCT6GT GTGTGGTGAT 320 

GATAAAGCGA CCTGAGAGCC 340 



(2) INFORMATION FOR SEQ ID NO: 19 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 340 nucleotides 
25 (B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 
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(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: nsSsalSS' 

5 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19 

CTCGACCGTT ACCGAACATG ACATAATGAC T6AAGAGTCC 40 

ATTTACCAAT CATTGTACTT GCAGCCTGAG GCAC6CGC6G 80 

CAATACGGTC ACTCACCCAA CGCCTGTACT 6TGGAG6CCC 120 

10 CATGTATAAC AGCAAGGGGC AACAATGTGG TTACCGTAGA 160 

TGCCGCGCCA GCGGC6TCTT CACCACCAGT ATGGGCAACA 200 

CCATGACGTG CTACATCAAG GCTTCA6CCG CCTGTAGAGC 240 

TGCAAAGCTC CAG6ACTGCA CGCTCCTGGT GTGTGGTGTG 280 

ACCTT6GTGG CCATTTGC6A GAGCCAAGGG ACGCACGAGG 320 

15 ATGAAGCGTG CCTGAGA6TC 340 

(2) INFORMATION FOR SEQ ID NO: 20 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 340 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: ns5i11

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20

CTCTACTGTC ACTGAACA6G ACATCAGGGT GGAAGAG6AG 40 

5 ATATACCAGT GCTGTAACCT TGAACCGGAG GCCAG6AAAG 80 

T6ATCTCCTC CCTCACGGAG CGGCTTTACT GCGGGG6CCC 120 

TATGTTCAAC AGCAAGGGGG CCCAGT6TGG TTATCGCC6T 160 

TGCCGTGCTA GTGGAGTCCT GCCTACCAOC TTC6GCAACA 200 

CAATCACTTG TTACATCAA6 GCTAGA6CG6 CTTCGAAOGC 240 

10 CGCAGGCCTC C6GAACCC06 ACTTTCTTGT CTGCGGAGAT 280 

GATCTGGTC6 TGGTGGCTGA GAGTGATGGC GTCGACGAGG 320 

ATAGAGCA6C CCTGAGA6CC 340 

(2) INFORMATION FOR SEQ ID NO: 21

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 340 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: ns5i4

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21
CTCGACTGTC ACTGAACAGG ACATCAGGGT GGAAGAGGAG 40

ATAIACCAAT 6CTGTAACCT TGAACC66AG GCCAGGAAAG 80 

TGATCTCCTC CCTCACGGAG CG6CTTTACT GCGGGG6CCC 120 

TATGTTCAAT A6CAA6GGGG CCCAGT6TGG TTATCGCC6T 160 

TGCCGTGCTA GT6GAGTTCT GCCTACCAGC TTCGGCAACA 200 

CAATCACTTG TTACATCAAG GCTAGAGCGG CTGCGAAGGC 240 

CGCAGGGCTC CG6ACCCCGG ACTTTCTCGT CTGCGGAGAT 280 

GATCTGGTTG TGGTGGCTGA GAGTGATGGC GTCGACGAGG 320 

ATAGAACAGC CCTGCGAGCC 340 



(2) INFORMATION FOR SEQ ID NO: 22

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 340 nucleotides
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: ns5gh8

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22

25 CTCAACTGTC ACTGAACAGG ACATCAGGGT GGAAGAGGAG 40 

ATATACCAAT 6CTGTAACCT TGAACCGGAG GCCAGGAAAG 80 

TGATCTCCTC CCTCACGGAA CGGCTTTACT GCGGGGGCCC 120
TATGTTCAAC AGCAAGGGGG CCCAGTGTGG TTATCGCCGT 160
TGCCGTGCCA GTGGAGTTCT GCCTACCAGC TTCGGCAACA 200
CAATCACTTG TTACATCAAA GCTAGAGCGG CTGCCGAAGC 240
CGCAGGCCTC CGGAACCCGG ACTTTCTTGT CTGCGGAGAT 280
GATCTGGTTG TGGTGGCTGA GAGTGATGGC GTCAATGAGG 320
ATAGAGCAGC CCTGGGAGCC 340

(2) INFORMATION FOR SEQ ID NO: 23 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 100 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: (ATCC # 40394)
(C) INDIVIDUAL ISOLATE: hcv1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23
GACGGCGTTG GTAATGGCTC AGCTGCTCCG GATCCCACAA 40
GCCATCTTGG ACATGATCGC TGGTGCTCAC TGGGGAGTCC 80
TGGCGGGCAT AGCGTATTTC 100

(2) INFORMATION FOR SEQ ID NO: 24

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 100 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: US5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24 

GACG6CGTTG GTGGTAGCTC AGGTACTCCG GATCCCACAA 40 

6CCATCATG6 ACAT6ATCGC TGGAGCCCAC TGGGGAGTCC 80 

15 TG6CGG6CAT A6C6TATTTC 100 

(2) INFORMATION FOR SEQ ID NO: 25 

(i) SEQUENCE CHARACTERISTICS: 
20 (A) LENGTH: 100 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

25 (ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 
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(C) INDIVIDUAL ISOLATE: AUS5 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25

AACGGCGCTG 6TAGTA6CTC AGCTGCTCAG GGTCCCGCAA 40 

5 GCCATC6T6G ACATGATCGC T6GTGCCCAC T666GAGTCC 80 

TAGCG06CAT AGCOTATTTT 100 

(2) INFORMATION FOR SEQ ID NO: 26 

10 (i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 100 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: US4 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26 

GACAGCCCTA GTGOTATCGC A6TTACTCCG GATCCCACAA 40 

GCCGTCATGG ATATGGTGGC GG6GGCCCAC TGGGGAGTCC 80 

TG6CGGGCCT TGCCTACTAT 100 

(2) INFORMATION FOR SEQ ID NO: 27

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 100 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: AR62

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27 

AGCAGCCCTA GTGGTGTC6C AGTTACTCCG GATCCCACAA 40 

AGCATCGT6G ACATGGTGGC 6GG6GCCCAC T66G6A6TCC 80 

15 TGGCGGGCCT T6CTTACTAT 100 

(2) INFORMATION FOR SEQ ID NO: 28

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 100 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: 115

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28

GGCA6CCCTA 6TGGTGTCGC A6TTACTCCG 6ATCCC6CAA 40 

5 6CT6TCGTGG ACAT66T66C G6608CCCAC TG60GAATCC 80 

TAGC66GTCT T6CCTACTAT 100 

(2) INFORMATION FOR SEQ ID NO: 29

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 100 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: GH8 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29 

TGTGGGTATG GT6GTGGCGC ACGTCCTGCG TTT6CCCCAG 40 

ACCTTGTTC6 ACATAATAGC CGGGGCCCAT TGGGGCATCT 80 

TGGCGGGCTT GGCCTATTAC 100

(2) INFORMATION FOR SEQ ID NO: 30

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 100 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: 14

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30

TGT6G6TAT6 GTGGTA6CAC ACGTCCTGCG TCTGCCCCA6 40 

ACCTTGTTC6 ACATAATAGC CGGGGCCCAT TG6GGCATCT 80 

15 TG6CAGGCCT A6CCTATTAC 100 

(2) INFORMATION FOR SEQ ID NO: 31 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 100 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: 111

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31

TGTGGGTATG GTGGTGGCGC AAGTCCTGCG TTTGCCCCAG 40

5 ACCTT6TTCG ACGTGCTAGC CGGGGCCCAT TGGOOCATCT 80 

T6GCGGGCCT GGCCTATTAC 100 

(2) INFORMATION FOR SEQ ID NO: 32 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 100 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: 110

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32
TACCACTATG CTCCTG6CAT ACTT6GT6CG CATCCCGGAG 40 
GTCATCCTGG ACATTATCAC G0GA6GACAC TGGGGCGT6A 80 
TGTTTGGCCT G6CTTATTTC 100 

(2) INFORMATION FOR SEQ ID NO: 33

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 252 nucleotides
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: (ATCC # 40394)
(C) INDIVIDUAL ISOLATE: hcv1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33 

6TTAGTATGA GTGTCGTGCA GCCTCCA6GA CCCCCCCTCC 40 

CGGGAGAGCC ATAGTGGTCT GCGGAACCGG T6AGTACACC 80 

15 GGAATTGCCA GGAC6ACCGG GTCCTTTCTT GGATCAACCC 120 

GCTCAATGCC TGGAGATTTG GGCGTGCCCC CGCAAGACTG 160 

CTA6CCGAGT AGTGTTGGGT CGCGAAAGGC CTTGTGGTAC 200 

TGCCTGATAG G6TGCTTGCG AGTGCCCCGG GAGGTCTCGT 240 

AGACCGTGCA CC 252



(2) INFORMATION FOR SEQ ID NO: 34 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 252 nucleotides 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: us5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34
6TTA6TAT6A GT6TCGT0CA 6CCTCCA6SA CCCCCCCTCC 40 
C6GGAGA6CC ATAGTGGTCT GC6GAACC6G T6A6TACACC 80 
GGAATTGCCA 66ACGACCGG GTCCTTTCTT GGATCAACCC 120 
10 6CTCAATGCC TGGAGATTTG GGC6TGCCCC CGCAAGACTG 160 

CTAGCCGAGT AGTGTTGGGT CGCGAAA6GC CTTGTGGTAC 200 
TGCCTGATAG GGTGCTTGCG AGTGCCCCGG GAGGTCTCGT 240 
AGACCGTGCA CC 252 
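
SEQ ID NOS: 33 and 34 are both 252-nucleotide entries (isolates hcv1 and us5), so entries of this kind can be compared position by position. The sketch below shows one assumed way to compute percent identity and list differing positions for two equal-length sequences. The function names are hypothetical, and the toy strings are not taken from the listing; OCR-damaged characters would need to be resolved against the original filing before real entries are compared.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def mismatch_positions(a: str, b: str) -> list[int]:
    """1-based positions at which the two sequences differ."""
    return [i + 1 for i, (x, y) in enumerate(zip(a.upper(), b.upper())) if x != y]


# Toy sequences for illustration only.
seq_a = "ACGTACGTAC"
seq_b = "ACGTTCGTAA"
print(percent_identity(seq_a, seq_b), mismatch_positions(seq_a, seq_b))
```

Positions are reported 1-based so that they line up with the running counters printed at the end of each sequence line.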

(2) INFORMATION FOR SEQ ID NO: 35

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 252 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: aus1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35

GTTAGTATGA GTGTCGTGCA GCCTCCAGGA CCCCCCCTCC 40 

CGGGA6AGCC ATAGT6GTCT GCGGAACCGG TGAGTACACC 80 

G6AATT6CCA 6GACGACC66 GTCCTTTCTT 66ATCAACCC 120 

5 6CTCAAT6CC T66AGATTTG G6CAC6CCCC CGCAAGATCA 160 

CTAGCC6AGT A6T6TTGGGT CGCGAAAGGC CTTGT6GTAC 200 

T6CCT6ATAG 66T6CTT6C6 AGT6CCCCGG GA66TCTC6T 240 

A6ACCGT6CA CC 252 

(2) INFORMATION FOR SEQ ID NO: 36

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 252 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: sp2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36

GTTAGTATGA GTGTCGTGCA GCCTCCAGGA CCCCCCCTCC 40 

25 CGG6AGAGCC ATAGTGGTCT GCGGAACCGG TGAGTACACC 80 

GGAATTGCCA 6GACGACCGG GTCCTTTCTT GGATAAACCC 120 

GCTCAATGCC TG6AGATTTG GGCGTGCCCC C6CGAGACTG 160 
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CTAGCC6AGT AGTGTTGGGT CGCGAAAGGC CTTGTGGTAC 200 
TGCCTGATAG GGTGCTT6CG AGT6CCCCGG GAGGTCTCGT 240 
A6ACC6TGCA CC 252 

(2) INFORMATION FOR SEQ ID NO: 37

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 252 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: gin2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37

GTTAGTAT6A GTGTC6T6CA 6CCTCCA6GA CCCCCCCTCC 40 

20 CGGGAGA6CC ATAGTGGTCT GCGGAACCGG TGAGTACACC 80 

66AATTGCCA 66ACGACCGG GTCCTTTCTT GGATCAACCC 120 

GCTCAATGCC TGGAGATTTG GGCGT6CCCC CGCAAGACTG 160 

CTAGCCGAGT AGTGTTGGGT CGCGAAAGGC CTTGTGGTAC 200 

TGCCTGATAG GGTGCTTGCG AGTGCCCCGG GAGGTCTCGT 240 

25 AGACCGTGCA CC 252 

(2) INFORMATION FOR SEQ ID NO: 38 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 252 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: 121

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38 

GTTAGTATGA GTGTCGTGCA GCCTCCAGGA CCCCCCCTCC 40 

CGGGAGAGCC ATAQTGGTCT GCGGAACCG6 TGAGTACACC 80 

15 GGAATTGCCA 6GACGACCGG GTCCTTTCTT GGATAAACCC 120 

GCTCAATGCC TGGAGATTTG GGCGTGCCCC CGCAAGACTG 160 

CTAGCC6AGT AGTGTTGGGT CGCGAAAGGC CTTGTGGTAC 200 

TGCCTGATAG GGTGCTTGCG AGTGCCCC6G 6AGGTCTCGT 240 

AGACCGTGCA CC 252 



(2) INFORMATION FOR SEQ ID NO: 39

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 252 nucleotides
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: U64

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39
6TTAGTATGA GTGTCGTGCA GCCTCCAGGA CCCCCCCTCC 40 
CGGGAGAGCC ATAGTGGTCT GCGGAACC6G TGAGTACACC 80 
OGAATTGCCA GGAC6ACCGG GTCCTTTCTT GQATCAAWC 120 
10 6CTCAATGCC TGGA6ATTTG GGC6TGCCCC CGCGAGACTG 160 

CTA6CCGAGT AGTGTTGGGT CGCGAAAG6C CTTGTGGTAC 200 
TGCCTGATAG GGTGCTTGCG AGTGCCCC6G 6AGGTCTC6T 240 
AGACCGTGCA CC 252 

(2) INFORMATION FOR SEQ ID NO: 40

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 252 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: jh1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40

6TTAGTATGA 6TGTC6T6CA GCCTCCAGGA CCCCCCCTCC 40 

CGG6AGAGCC ATAGTGGTCT 6CG6AACCGG TGAGTACACC 80 

GGAATTGCCA GGAC6ACC6G GTCCTTTCTT GGATCAACCC 120 

5 GCTCAATGCC TGGAGATTTG G6CGTGCCCC CGCGAGACTG 160 

CTAGCCGAGT AGTGTTGGGT CGCGAAAGGC CTTGTGGTAC 200 

TGCCTGATAG GGTGCTTGCG AGTGCCCCGG GAGGTCTCGT 240 

AGACCGTGCA TC 252 

(2) INFORMATION FOR SEQ ID NO: 41

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 252 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: nac5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41

GTTAGTATGA 6TGTCGTGCA GCCTCCAGGA CCCCCCCTCC 40 

25 CGGGAGAGCC ATAGTGGTCT GCGGAACCGG TGAGTACACC 80 

GGAATTGCCA GGACGACCGG GTCCTTTCTT GGATCAACCC 120 

GCTCAATGCC TGGAGATTTG GGCGTGCCCC CGCGAGACTG 160 
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CTAGCCGAGT AGTGTTGGGT CGCGAAAGGC CTTGTGGTAC 200 
TGCCTGATA6 GGTGCTTGCG AGTGCCCC6G GAGGTCTCGT 240 
AGACCGTGCA CC 252 

(2) INFORMATION FOR SEQ ID NO: 42

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 252 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: arg2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42

GTTAGTATGA GTGTCGTGCA GCCTCCAGGA CCCCCCCTCC 40 

CGGGAGAGCC ATAGTGGTCT GCGGAACCGG TGAGTACACC 80 

GGAATTGCCA GGACGACCGG GTCCTTTCTT GGATCAACCC 120

GCTCAATGCC TGGAGATTTG GGCGTGCCCC CGCGAGACTG 160

CTAGCCGAGT AGTGTTGGGT CGCGAAAGGC CTTGTGGTAC 200

TGCCTGATAG GGTGCTTGCG AGTGCCCCGG GAGGTCTCGT 240

AGACCGTGCA CC 252

(2) INFORMATION FOR SEQ ID NO: 43

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 252 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: spl

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43 

GTTAGTATGA GTGTCGTGCA GCCTCCAGGA CCCCCCCTCC 40

CGGGAGAGCC ATAGTGGTCT GCGGAACCGG TGAGTACACC 80

GGAATTGCCA GGACGACCGG GTCCTTTCTT GGATCAACCC 120

GCTCAATGCC TGGAGATTTG GGCGTGCCCC CGCGAGACTG 160

CTAGCCGAGT AGTGTTGGGT CGCGAAAGGC CTTGTGGTAC 200

TGCCTGATAG GGTGCTTGCG AGTGCCCCGG GAGGTCTCGT 240

AGACCGTGCA CC 252 






(2) INFORMATION FOR SEQ ID NO: 44 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 252 nucleotides 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: ghl 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44
GTTAGTATGA GTGTCGTGCA GCCTCCAGGA CCCCCCCTCC 40
CGGGAGAGCC ATAGTGGTCT GCGGAACCGG TGAGTACACC 80
GGAATTGCCA GGACGACCGG GTCCTTTCTT GGATCAACCC 120
GCTCAATGCC TGGAGATTTG GGCGTGCCCC CGCGAGACTG 160
CTAGCCGAGT AGTGTTGGGT CGCGAAAGGC CTTGTGGTAC 200
TGCCTGATAG GGTGCTTGCG AGTGCCCCGG GAGGTCTCGT 240
AGACCGTGCA CC 252

(2) INFORMATION FOR SEQ ID NO: 45

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 252 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: il5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45

GTTAGTATGA GTGTCGTGCA GCCTCCAGGA CCOTCCCTCC 40

CGGGAGAGCC ATAGTGGTCT GCGGAACCGG TGAGTACACC 80

GGAATTGCCA GGACGACCGG GTCCTTTCTT GGATCAACCC 120

GCTCAATGCC TGGAGATTTG GGCGTGCCCC CGCGAGACTG 160

CTAGCCGAGT AGTGTTGGGT CGCGAAAGGC CTTGTGGTAC 200

TGCCTGATAG GGTGCTTGCG AGTGCCCCGG GAGGTCTCGT 240

AGACCGTGCA CO 252

(2) INFORMATION FOR SEQ ID NO: 46

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 252 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: ilO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46

GCTAGTATCA GTGTCGTACA GCCTCCAGGC CCCCCCCTCC 40

CGGGAGAGCC ATAGTGGTCT GCGGAACCGG TGAGTACACC 80

GGAATTGCCG GGAAGACTGG GTCCTTTCTT GGATAAACCC 120

ACTCTATGCC CGGCCATTTG GGCGTGCCCC CGCAAGACTG 160

CTAGCCGAGT AGCGTTGGGT TGCGAAAGGC CTTGTGGTAC 200

TGCCTGATAG GGTGCTTGCG AGTGCCCCGG GAGGTCTCGT 240

AGACCGTGCA TC 252 

(2) INFORMATION FOR SEQ ID NO: 47

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 252 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: arg6 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47 

GTTAGTATGA GTCTCGTACA GCCTCCAGGC CCCCCCCTCC 40 

CGGGAGAGCC ATAGTGGTCT GCGGAACCGG TGAGTACACC 80

GGAATTGCTG GGAAGACTGG GTCCTTTCTT GGATAAACCC 120 

ACTCTATGCC CAGCCATTTG GGCGTGCCCC CGCAAGACTG 160

CTAGCCGAGT AGCGTTGGGT TGCGAAAGGC CTTGTGGTAC 200

TGCCTGATAG GGTGCTTGCG AGTGCCCCGG GAGGTCTCGT 240

AGACCGTGCA TC 252

(2) INFORMATION FOR SEQ ID NO: 48

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 252 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: 621

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48

GTTAGTACGA GTGTCGTGCA GCCTCCAGGA CTCCCCCTCC 40

CGGGAGAGCC ATAGTGGTCT GCGGAACCGG TGAGTACACC 80

GGAATCGCTG GGGTGACCGG GTCCTTTCTT GGAGCAACCC 120

GCTCAATACC CAGAAATTTG GGCGTGCCCC CGCGAGATCA 160

CTAGCCGAGT AGTGTTGGGT CGCGAAAGGC CTTGTGGTAC 200

TGCCTGATAG GGTGCTTGCG AGTGCCCCGG GAGGTCTCGT 240

AGACCGTGCA AC 252

(2) INFORMATION FOR SEQ ID NO: 49

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 252 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: gj 61329 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49 

GTTAGTACGA GTGTCGTGCA GCCTCCAGGA CCCCCCCTCC 40

CGGGAGAGCC ATAGTGGTCT GCGGAACCGG TGAGTACACC 80

GGAATCGCTG GGGTGACCGG GTCCTTTCTT GGAGTAACCC 120

GCTCAATACC CAGAAATTTG GGCGTGCCCC CGCGAGATCA 160

CTAGCCGAGT AGTGTTGGGT CGCGAAAGGC CTTGTGGTAC 200

TGCCTGATAG GGTGCTTGCG AGTGCCCCGG GAGGTCTCGT 240

AGACCGTGCA AC 252 

(2) INFORMATION FOR SEQ ID NO: 50 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 180 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: 8a3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50



GTTAGTATGA GTGTCGAACA GCCTCCAGGA CCCCCCCTCC 40 

CGGGAGAGCC ATAGTGGTCT GCGGAACCGG TGAGTACACC 80

GGAATTGCCG GGATGACCGG GTCCTTTCTT GGATAAACCC 120 

GCTCAATGCC CGGAGATTTG GGCGTGCCCC CGCGAGACTG 160 

CTAGCCGAGT AGTGTTGGGT 180

(2) INFORMATION FOR SEQ ID NO: 51 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 180 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: sa4

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51

GTTAGTATGA GTGTCGAACA GCCTCCAGGA CCCCCCCTCC 40
CGGGAGAGCC ATAGTGGTCT GCGGAACCGG TGAGTACACC 80
GGAATTGCCG GGATGACCGG GTCCTTTCTT GGATAAACCC 120
GCTCAATGCC CGGAGATTTG GGCGTGCCCC CGCGAGACTG 160
CTAGCCGAGT AGTGTTGGGT 180

(2) INFORMATION FOR SEQ ID NO: 52 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 549 nucleotides 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: (ATCC # 40394)

(C) INDIVIDUAL ISOLATE: hCVl

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52

ATGAGCACGA ATCCTAAACC TCAAAAAAAA AACAAACGTA 40

ACACCAACCG TCGCCCACAG GACGTCAAGT TCCCGGGTGG 80

CGGTCAGATC GTTGGTGGAG TTTACTTGTT GCCGCGCAGG 120

GGCCCTAGAT TGGGTGTGCG CGCGACGAGA AAGACTTCCG 160

AGCGGTCGCA ACCTCGAGGT AGACGTCAGC CTATCCCCAA 200

GGCTCGTCGG CCCGAGGGCA GGACCTGGGC TCAGCCCGGG 240

TACCCTTGGC CCCTCTATGG CAATGAGGGC TGCGGGTGGG 280

CGGGATGGCT CCTGTCTCCC CGTGGCTCTC GGCCTAGCTG 320

GGGCCCCACA GACCCCCGGC GTAGGTCGCG CAATTTGGGT 360

AAGGTCATCG ATACCCTTAC GTGCGGCTTC GCCGACCTCA 400 

TGGGGTACAT ACCGCTCGTC GGCGCCCCTC TTGGAGGCGC 440 

TGCCAGGGCC CTGGCGCATG GCGTCCGGGT TCTGGAAGAC 480 

GGCGTGAACT ATGCAACAGG GAACCTTCCT GGTTGCTCTT 520 

TCTCTATCTT CCTTCTGGCC CTGCTCTCT 549

(2) INFORMATION FOR SEQ ID NO: 53

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 549 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: U85

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53
ATGAGCACGA ATCCTAAACC TCAAAGAAAA ACCAAACGTA 40
ACACCAACCG TCGCCCACAG GACGTCAAGT TCCCGGGTGG 80
CGGTCAGATC GTTGGTGGAG TTTACTTGTT GCCGCGCAGG 120
GGCCCTAGAT TGGGTGTGCG CGCGACGAGG AAGACTTCCG 160
AGCGGTCGCA ACCTCGAGGT AGACGTCAGC CTATCCCCAA 200
GGCGCGTCGG CCCGAGGGCA GGACCTGGGC TCAGCCCGGG 240
TACCCTTGGC CCCTCTATGG CAATGAGGGT TGCGGGTGGG 280
CGGGATGGCT CCTGTCTCCC CGTGGCTCTC GGCCTAGTTG 320
GGGCCCCACA GACCCCCGGC GTAGGTCGCG CAATTTGGGT 360
AAGGTCATCG ATACCCTTAC GTGCGGCTTC GCCGACCACA 400
TGGGGTACAT ACCGCTCGTC GGCGCCCCTC TTGGAGGCGC 440
TGCCAGGGCT CTGGCGCATG GCGTCCGGGT TCTGGAAGAC 480
GGCGTGAACT ATGCAACAGG GAACCTTCCT GGTTGCTCTT 520
TCTCTATCTT CCTTCTGGCC CTGCTCTCT 549

(2) INFORMATION FOR SEQ ID NO: 54

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 549 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: ausl

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54

ATGAGCACGA ATCCTAAACC TCAAAGAAAA ACCAAACGTA 40

ACACCAACCG TCGCCCACAG GACGTTAAGT TCCCGGGTGG 80

CGGTCAGATC GTTGGTGGAG TTTACTTGTT GCCGCGCAGG 120

GGCCCTAGAT TGGGTGTGCG CGCGACGAGG AAGACTTCCG 160

AGCGGTCGCA ACCTCGAGGT AGACGTCAGC CTATCCCTAA 200

GGCGCGTCGG CCCGAGGGCA GGACCTGGGC TCAGCCCGGG 240

TACCCCTGGC CCCTCTATGG TAATGAGGGT TGCGGATGGG 280

CGGGATGGCT CCTGTCCCCC CGTGGCTCTC GGCCTAGTTG 320

GGGCCCTACA GACCCCCGGC GTAGGTCGCG CAATTTGGGT 360

AAGGTCATCG ATACCCTCAC GTGCGGCTTC GCCGACCACA 400

TGGGGTACAT TCCGCTCGTT GGCGCCCCTC TTGGGGGCGC 440

TGCCAGGGCC CTGGCGCATG GCGTCCGGGT TCTGGAAGAC 480

GGCGTGAACT ATGCAACAGG GAATCTTCCT GGTTGCTCTT 520

TCTCTATCTT CCTTCTGGCC CTTCTCTCT 549






(2) INFORMATION FOR SEQ ID NO: 55 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 549 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: sp2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55

ATGAGCACGA ATCCTAAACC TCAAAGAAAA ACCAAACGTA 40 

ACACCAACCG TCGCCCACAG GACGTCAAGT TCCCGGGTGG 80

CGGTCAGATC GTTGGTGGAG TTTACTTGTT GCCGCGCAGG 120

GGCCCTAGAT TGGGTGTGCG CACGACGAGG AAGACTTCCG 160

AGCGGTCGCA ACCTCGAGGT AGACGTCAGC CCATCCCCAA 200

GGCTCGTCGA CCCGAGGGCA GGACCTGGGC TCAGCCCGGG 240

TACCCTTGGC CCCTCTATGG CAATGAGGGC TGCGGGTGGG 280

CGGGATGGCT CCTGTCTCCC CGTGGCTCTC GGCCTAGCTG 320

GGGCCCCACA GACCCCCGGC GTAGGTCGCG CAATTTGGGT 360

AAGGTCATCG ATACCCTTAC GTGCGGCTTC GCCGACCTCA 400

TGGGGTACAT ACCGCTCGTC GGCGCCCCTC TTGGAGGCGC 440

TGCCAGAGCC CTGGCGCATG GCGTCCGGGT TCTGGAAGAC 480

GGCGTGAACT ATGCAACAGG GAACCTTCCC GGTTGCTCTT 520

TCTCTATCTT CCTTCTGGCC CTGCTCTCT 549



(2) INFORMATION FOR SEQ ID NO: 56 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 549 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: gin2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56

ATGAGCACGA ATCCTAAACC TCAAAGAAGA ACCAAACGTA 40
ACACCAACCG TCGCCCACAG GACGTCAAGT TCCCGGGTGG 80
CGGTCAGATC GTTGGTGGAG TTTACTTGTT GCCGCGCAGG 120
GGCCCTAGAT TGGGTGTGCG CGCGACGAGG AAGACTTCCG 160
AGCGGTCGCA ACCTCGAGGT AGACGTCAGC CTATCCCCAA 200
GGCACGTCGG CCCGAGGGTA GGACCTGGGC TCAGCCCGGG 240
TACCCTTGGC CCCTCTATGG CAATGAGGGT TGCGGGTGGG 280
CGGGATGGCT CCTGTCTCCC CGCGGCTCTC GGCCTAACTG 320
GGGCCCCACA GACCCCCGGC GTAGGTCGCG CAATTTGGGT 360
AAGGTCATCG ATACCCTTAC GTGCGGCTTC GCCGACCTCA 400
TGGGGTACAT ACCGCTCGTC GGCGCCCCTC TTGGAGGCGC 440
TGCCAGGGCC CTGGCGCATG GCGTCCGGGT TCTGGAAGAC 480
GGCGTGAACT ATGCAACAGG GAACCTTCCT GGTTGCTCTT 520
TCTCTATCTT CCTTCTGGCC CTGCTCTCT 549



(2) INFORMATION FOR SEQ ID NO: 57

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 549 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: i21

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57

ATGAGCACGA ATCCTAAACC TCAAAGAAAA ACCAAACGTA 40

ACACCAACCG TCGCCCACAG GACGTCAAGT TCCCGGGTGG 80

CGGTCAGATC GTTGGTGGAG TTTACTTGTT GCCGCGCAGG 120

GGCCCTAGAT TGGGTGTGCG CGCGACGAGG AAGACTTCCG 160

AGCGGTCGCA ACCTCGTGGT AGACGCCAGC CTATCCCCAA 200

GGCGCGTCGG CCCGAGGGCA GGACCTGGGC TCAGCCCGGG 240

TACCCTTGGC CCCTCTATGG CAATGAGGGT TGCGGGTGGG 280

CGGGATGGCT CCTGTCTCCC CGTGGCTCTC GGCCTAGCTG 320

GGGCCCCACA GACCCCCGGC GTAGGTCGCG CAATTTGGGT 360

AAGGTCATCG ATACCCTTAC GTGCGGCTTC GCCGACCTCA 400

TGGGGTACAT ACCGCTCGTC GGCGCCCCTC TTGGAGGCGC 440

TGCCAGGGCC CTGGCGCATG GCGTCCGGGT TCTGGAAGAC 480

GGCGTGAACT ATGCAACAGG GAACCTTCCT GGTTGCTCTT 520

TTTCTATTTT CCTTCTGGCC CTGCTCTCT 549 






(2) INFORMATION FOR SEQ ID NO: 58

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 549 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: us4

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58

ATGAGCACGA ATCCTAAACC TCAAAGAAAA ACCAAACGTA 40

ACACCAACCG CCGCCCACAG GACGTTAAGT TCCCGGGCGG 80

TGGCCAGGTC GTTGGTGGAG TTTACCTGTT GCCGCGCAGG 120

GGCCCCAGGT TGGGTGTGCG CGCGACTAGG AAGACTTCCG 160

AGCGGTCGCA ACCTCGTGGA AGGCGACAAC CTATCCCCAA 200

GGCTCGCCAG CCCGAGGGCA GGGCCTGGGC TCAGCCCGGG 240

TACCCTTGGC CCCTCTATGG CAATGAGGGT ATGGGGTGGG 280

CAGGATGGCT CCTGTCACCC CGTGGCTCTC GGCCTAGTTG 320

GGGCCCCACG GACCCCCGGC GTAGGTCGCG TAATTTGGGT 360 

AAGGTCATCG ATACCCTCAC ATGCGGCTTC GCCGACCTCA 400 

TGGGGTACAT TCCGCTCGTC GGCGCCCCCC TTAGGGGCGC 440

TGCCAGGGCC TTGGCGCATG GCGTCCGGGT TCTGGAGGAC 480

GGCGTGAACT ACGCAACAGG GAATCTGCCC GGTTGCTCCT 520

TTTCTATCTT CCTCTTGGCT CTGCTGTCC 549

(2) INFORMATION FOR SEQ ID NO: 59

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 549 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: jhl 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59

ATGAGCACAA ATCCTAAACC TCAAAGAAAA ACCAAACGTA 40

ACACCAACCG CCGCCCACAG GACGTCAAGT TCCCGGGCGG 80

TGGTCAGATC GTTGGTGGAG TTTACCTGTT GCCGCGCAGG 120

GGCCCCAGGT TGGGTGTGCG CGCGACTAGG AAGACTTCCG 160

AGCGGTCGCA ACCTCGTGGA AGGCGACAAC CTATCCCCAA 200

GGCTCGCCAG CCCGAGGGCA GGGCCTGGGC TCAGCCCGGG 240

TACCCTTGGC CCCTCTATGG CAACGAGGGT ATGGGGTGGG 280

CAGGATGGCT CCTGTCACCC CGTGGCTCTC GGCCTAGTTG 320 

GGGCCCCACG GACCCCCGGC GTAGGTCGCG TAATTTGGGT 360

AAGGTCATCG ATACCCTCAC ATGCGGCTTC GCCGACCTCA 400

TGGGGTACAT TCCGCTTGTC GGCGCCCCCC TAGGGGGCGC 440

TGCCAGGGCC CTGGCACATG GTGTCCGGGT TCTGGAGGAC 480

GGCGTGAACT ATGCAACAGG GAATTTGCCC GGTTGCTCTT 520

TCTCTATCTT CCTCTTGGCT CTGCTGTCC 549

(2) INFORMATION FOR SEQ ID NO: 60

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 549 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: nac5 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60

ATGAGCACAA ATCCTAAACC CCAAAGAAAA ACCAAACGTA 40

ACACCAACCG TCGCCCACAG GACGTCAAGT TCCCGGGCGG 80

TGGTCAGATC GTTGGTGGAG TTTACCTGTT GCCGCGCAGG 120

GGCCCCAGGT TGGGTGTGCG CGCGACTAGG AAGACTTCCG 160

AGCGGTCGCA ACCTCGTGGA AGGCGACAAC CTATCCCCAA 200

GGCTCGCCGG CCCGAGGGCA GGTCCTGGGC TCAGCCCGGG 240 

TACCCTTGGC CCCTCTATGG CAACGAGGGT ATGGGGTGGG 280 

CAGGATGGCT CCTGTCACCC CGCGGCTCCC GGCCTAGTTG 320

GGGCCCCACG GACCCCCGGC GTAGGTCGCG TAATTTGGGT 360

AAGGTCATCG ATACCCTCAC ATGCGGCTTC GCCGACCTCA 400

TGGGGTACAT TCCGCTCGTC GGCGCCCCCC TAGGGGGCGC 440

TGCCAGGGCC CTGGCACATG GTGTCCGGGT TCTGGAGGAC 480

GGCGTGAACT ATGCAACAGG GAATTTGCCT GGTTGCTCTT 520

TCTCTATCTT CCTCTTGGCT CTGCTGTCC 549

(2) INFORMATION FOR SEQ ID NO: 61

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 549 nucleotides 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: arg2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61

ATGAGCACGA ATCCTAAACC TCAAAGAAAA ACCAAACGTA 40 

ACACCAACCG CCGCCCACAG GACGTCAAGT TCCCGGGCGG 80

TGGTCAGATC GTTGGTGGAG TTTACTTGTT GCCGCGCAGG 120

GGCCCCAGGT TGGGTGTGCG CGCGACTAGG AAGACTTCCG 160

AGCGGTCGCA ACCTCGTGGA AGGCGACAAC CTATCCCCAA 200

GGCTCGCCAG CCCGAGGGTA GGGCCTGGGC TCAGCCCGGG 240

TACCCTTGGC CCCTCTATGG CAATGAGGGT ATGGGGTGGG 280

CAGGGTGGCT CCTGTCCCCC CGCGGCTCCC GGCCTAGTTG 320



SUBSTITUTE SHEET 



WO 92/19743

PCT/US92/04036



- 106 - 



GGGCCCCACA GACCCCCGGC GTAGGTCGCG TAATTTGGGT 360

AAGGTCATCG ATACCCTCAC ATGCGGCTTC GCCGACCTCA 400

TGGGGTACAT TCCGCTCGTC GGCGCCCCCC TAGGGGGCGC 440

TGCCAGGGCC CTGGCGCATG GCGTCCGGGT TCTGGAGGAC 480

GGCGTGAACT ATGCAACAGG GAATCTGCCC GGTTGCTCTT 520

TCTCTATCTT CCTCTTGGCT TTGCTGTCC 549

(2) INFORMATION FOR SEQ ID NO: 62

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 549 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: spl

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62

ATGAGCACGA ATCCTAAACC TCAAAGAAAA ACCAAACGTA 40

ACACCAACCG CCGCCCACAG GACGTCAAGT TCCCGGGCGG 80 

TGGTCAGATC GTTGGTGGAG TTTACCTGTT GCCGCGCAGG 120 

GGCCCCAGGT TGGGTGTGCG CGCGACTAGG AAGACTTCCG 160 

AGCGGTCGCA ACCTCGTGGA AGGCGACAAC CTATCCCCAA 200

GGCTCGCCGG CCCGAGGGCA GGGCCTGGGC TCAGCCCGGG 240

TATCCTTGGC CCCTCTATGG CAATGAGGGT CTGGGGTGGG 280

CAGGATGGCT CCTGTCACCC CGCGGCTCTC GGCCTAGCTG 320

GGGCCCTACC GACCCCCGGC GTAGGTCGCG CAACTTGGGT 360

AAGGTCATCG ATACCCTTAC GTGCGGCTTC GCCGACCTCA 400

TGGGGTACAT TCCGCTCGTC GGCGCCCCCC TTAGGGGCGC 440

TGCCAGGGCC CTGGCGCATG GCGTCCGGGT TCTGGAGGAC 480

GGCGTGAACT ATGCAACAGG GAATTTGCCC GGTTGCTCTT 520

TCTCTATCTT CCTCTTGGCT TTGCTGTCC 549



(2) INFORMATION FOR SEQ ID NO: 63

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 549 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: ghl

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63 

ATGAGCACGA ATCCTAAACC TCAAAGAAAA ACCAAACGTA 40 

ACACCAACCG CCGCCCACAG GACGTCAAGT TCCCGGGCGG 80

TGGTCAGATC GTTGGTGGAG TTTACTTGTT GCCGCGCAGG 120

GGCCCCAGGT TGGGTGTGCG CGCGACTAGG AAGACTTCCG 160

AGCGGTCGCA ACCTCGTGGA AGGCGACAAC CTATCCCCAA 200

GGCTCGCCGG CCCGAGGGCA GGGCCTGGGC TCAGCCCGGG 240

TACCCTTGGC CCCTCTATGG CAATGAGGGT ATGGGGTGGG 280

CAGGATGGCT CCTGTCACCC CGTGGTTCTC GGCCTAGTTG 320

GGGCCCCACG GACCCCCGGC GTAGGTCGCG CAATTTGGGT 360

AAGATCATCG ATACCCTCAC GTGCGGCTTC GCCGACCTCA 400

TGGGGTACAT TCCGCTCGTC GGCGCCCCCC TAGGGGGCGC 440

TGCCAGGGCC CTGGCGCATG GCGTCCGGGT TCTGGAGGAC 480

GGCGTGAACT ATGCAACAGG GAATCTGCCC GGTTGCTCCT 520

TTTCTATCTT CCTTCTGGCT TTGCTGTCC 549 



(2) INFORMATION FOR SEQ ID NO: 64



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 549 nucleotides 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: il5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64

ATGAGCACGA ATCCTAAACC TCAAAGAAAA ACCAAACGTA 40

ACACCAACCG CCGCCCACAG GACGTCAAGT TCCCGGGCGG 80

TGGTCAGATC GTTGGTGGAG TTTACCTGTT GCCGCGCAGG 120

GGCCCCAGGT TGGGTGTGCG CGCGACTAGG AAGACTTCCG 160

AGCGGTCGCA ACCTCGTGGA AGGCGACAAC CTATCCCCAA 200

GGCTCGCCAG CCCGAGGGCA GGGCCTGGGC TCAGCCCGGG 240

TACCCCTGGC CCCTCTATGG CAATGAGGGT ATGGGGTGGG 280

CAGGATGGCT CCTGTCACCC CGCGGCTCCC GGCCTAGTTG 320

GGGCCCCAAA GACCCCCGGC GTAGGTCGCG TAATTTGGGT 360

AAGGTCATCG ATACCCTCAC ATGCGGCTTC GCCGACCTCA 400

TGGGGTACAT TCCGCTCGTC GGCGCCCCCT TAGGGGGCGC 440

TGCCAGGGCC CTGGCGCATG GCGTCCGGGT TCTGGAGGAC 480

GGCGTGAACT ATGCAACAGG GAATCTACCC GGTTGCTCTT 520

TCTCTATCTT CCTCTTGGCT TTGCTGTCC 549

(2) INFORMATION FOR SEQ ID NO: 65

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 549 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: ilO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65 
ATGAGCACAA ATCCTAAACC TCAAAGAAAA ACCAAAAGAA 40

ACACTAACCG CCGCCCACAG GACGTCAAGT TCCCGGGCGG 80

TGGCCAGATC GTTGGCGGAG TATACTTGCT GCCGCGCAGG 120

GGCCCGAGAT TGGGTGTGCG CGCGACGAGG AAAACTTCCG 160

AACGATCCCA GCCACGCGGA AGGCGTCAGC CCATCCCTAA 200

ASMCCGTCGC ACCGCTGGCA AGTCCTGGGG AAGGCCAGGA 240

TATCCTTGGC CCCTGTATGG GAATGAGGGT CTCGGCTGGG 280

CAGGGTGGCT CCTGTCCCCC CGTGGCTCTC GCCCTTCATG 320

GGGCCCCACT GACCCCCGGC ATAGATCGCG CAACTTGGGT 360

AAGGTCATCG ATACCCTAAC GTGCGGTTTT GCCGACCTCA 400

TGGGGTACAT TCCCGTCATC GGCGCCCCCG TTGGAGGCGT 440

TGCCAGAGCT CTCGCCCACG GAGTGAGGGT TCTGGAGGAT 480

GGGGTAAATT ATGCAACAGG GAATTTGCCC GGTTGCTCTT 520 

TCTCTATCTT TCTCTTAGCC CTCTTGTCT 549 

(2) INFORMATION FOR SEQ ID NO: 66

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 510 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: arg6

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66

ATGAGCACAA ATCCTCAACC TCAAAGAAAA ACCAAAAGAA 40

ACACTAACCG CCGCCCACAG GACGTCAAGT TCCCGGGCGG 80

TGGTCAGATC GTTGGCGGAG TATACTTGTT GCCGCGCAGG 120

GGCCCCAGGT TGGGTGTGCG CGCGACGAGG AAAACTTCCG 160

AACGGTCCCA GCCACGTGGG AGGCGCCAGC CCATCCCCAA 200

AGATCGGCGC ACCACTGGCA AGTCCTGGGG GAAGCCAGGA 240

TACCCTTGGC CCCTGTATGG GAATGAGGGT CTCGGCTGGG 280

CAGGGTGGCT CCTGTCCCCC CGCGGTTCTC GCCCTTCATG 320

GGGCCCCACT GACCCCCGGC ATAGATCACG CAACTTGGGT 360

AAGGTCATCG ATACCCTAAC GTGTGGTTTT GCCGACCTCA 400

TGGGGTACAT TCCCGTCGGT GGTGCCCCCG TTGGTGGTGT 440

CGCCAGAGCC CTTGCCCATG GGGTGAGGGT TCTGGAAGAC 480

GGGATAAATT ATGCAACAGG GAATCTGCCC 510

(2) INFORMATION FOR SEQ ID NO: 67



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 nucleotides 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67 
CAAACGTAAC ACCAACCGRC GCCCACAGG 29 
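
As an illustration only, and not part of the original listing: several of the oligonucleotides in this listing contain ambiguity symbols (for example R in SEQ ID NO: 67), which under the standard IUPAC nucleotide code each denote more than one base. The sketch below, assuming that code applies here, enumerates the unambiguous sequences covered by such an entry. The function name `expand_degenerate` is hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (DNA alphabet).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand_degenerate(seq):
    """Return every unambiguous sequence matched by a degenerate oligonucleotide."""
    cleaned = "".join(seq.split()).upper()   # drop the block spacing
    choices = [IUPAC[base] for base in cleaned]
    return ["".join(combo) for combo in product(*choices)]

# SEQ ID NO: 67 as printed above; the single R gives two variants.
variants = expand_degenerate("CAAACGTAAC ACCAACCGRC GCCCACAGG")
for v in variants:
    print(len(v), v)
```

The number of variants grows multiplicatively with each ambiguous position, so entries with many such symbols expand to large sets.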
(2) INFORMATION FOR SEQ ID NO: 68

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68
ACAGAYCCGC AKAGRTCCCC CACG 24

(2) INFORMATION FOR SEQ ID NO: 69

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69

CGAACCTCGA GGTAGACGTC AGCCTATCCC 30

(2) INFORMATION FOR SEQ ID NO: 70

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70
GCAACCTCGT GGAAGGCGAC AACCTATCCC 30

(2) INFORMATION FOR SEQ ID NO: 71 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71
GTCACCAATG ATTGCCCTAA CTCGAGTATT 30

(2) INFORMATION FOR SEQ ID NO: 72

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72

GTCACGAACG ACTGCTCCAA CTCAAG 26

(2) INFORMATION FOR SEQ ID NO: 73

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73 
TGGACATGAT CGCTGGWGCY CACTGGGG 28 

(2) INFORMATION FOR SEQ ID NO: 74

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74
TGGAYATGGT GGYGGGGGCY CACTGGGG 28

(2) INFORMATION FOR SEQ ID NO: 75

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 nucleotides

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75
ATGATGAACT GGTCVCCYAC 20

(2) INFORMATION FOR SEQ ID NO: 76

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76

ACCTTVGCCC AGTTSCCCRC CATGGA 26

(2) INFORMATION FOR SEQ ID NO: 77

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77

AACCCACTCT ATGYCCGGYC AT 22

(2) INFORMATION FOR SEQ ID NO: 78 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78
GAATCGCTGG GGTGACCG 18

(2) INFORMATION FOR SEQ ID NO: 79

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79
CCATGAATCA CTCCCCTGTG AGGAACTA 28

(2) INFORMATION FOR SEQ ID NO: 80 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 nucleotides 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80

TTGCGGGGGC ACGCCCAA 18

(2) INFORMATION FOR SEQ ID NO: 81 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81
YGAAGCGGGC ACAGTCARRC AAGARAGCAG GGC 33


(2) INFORMATION FOR SEQ ID NO: 82 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 nucleotides 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82
RTA3?AGCCCY GWGGAGTTGC GCACTTGGTR GGC 33

(2) INFORMATION FOR SEQ ID NO: 83

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83
RATACTCGAG TTAGGGCAAT CATTGGTGAC RTG 33

(2) INFORMATION FOR SEQ ID NO: 84



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84
AGYRTGCAGG ATGGYATCRK BCGYCTCGTA CAC 33

(2) INFORMATION FOR SEQ ID NO: 85

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85

GTTRCCCTCR CGAACGCAAG GGACRCACCC CGG 33

(2) INFORMATION FOR SEQ ID NO: 86 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86
CGTRGGGGTY AYCGCCACCC AACACCTCGA GRC 33

(2) INFORMATION FOR SEQ ID NO: 87

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87
CGTYGYGGGG AGTTTGCCRT CCCTGGTGGC YAC 33

(2) INFORMATION FOR SEQ ID NO: 88

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88




CCCGACAAGC AGATCGATGT GACGTCGAAG CTG 33

(2) INFORMATION FOR SEQ ID NO: 89

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89
CCCCACGTAG ARGGCCGARC AGAGRGTGGC GCY 33

(2) INFORMATION FOR SEQ ID NO: 90

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90
YTGRCCGACA AGAAAGACAG ACCCGCAYAR GTC 33

(2) INFORMATION FOR SEQ ID NO: 91

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91
CGTCCAGTGG YGCCTGGGAG AGAAGGTGAA CAG 33

(2) INFORMATION FOR SEQ ID NO: 92

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 nucleotides 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92

GCCGGGATAG ATRGARCAAT TGCARYCTTG CGT 33

(2) INFORMATION FOR SEQ ID NO: 93

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93
CATATCCCAT GCCATGCGGT GACCCGTTAY ATG 33

(2) INFORMATION FOR SEQ ID NO: 94 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94
YACCAAYGCC GTCGTAGGGG ACCARTTCAT CAT 33

(2) INFORMATION FOR SEQ ID NO: 95

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95
GATGGCTTGT GGGATCCGGA GYASCTGAGC YAY 33

(2) INFORMATION FOR SEQ ID NO: 96 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96
GACTCCCCAG TGRGCWCCAG CGATCATRTC CAW 33

(2) INFORMATION FOR SEQ ID NO: 97

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

5 

(ii) KOLSCDLE TYPE: TSSk 

(3ci) SEQUENCE DESCRIPTION: SEQ ZD NO: 97 

CCCCACCAT6 GA6AAAIACG CTAT6CCC6C YAG 33 

(2) INFORMATION FOR SEQ ID NO: 98

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98

TAGYAGCAGY ACTACYARGA CCTTCGCCCA GTT 33

(2) INFORMATION FOR SEQ ID NO: 99

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99

GSTGACGTGR GTKTCYGCGT CHACGCCGGC RAA 33

(2) INFORMATION FOR SEQ ID NO: 100

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100

GGAAGYTGGG ATGGTYARRC ARGASAGCAR AGC 33

(2) INFORMATION FOR SEQ ID NO: 101

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101

GTAYAYYCCG GACRCGTTGC GCACTICRIA AGO 33

(2) INFORMATION FOR SEQ ID NO: 102

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102

AATRCTTGKG TTGGAGCART CGTTYGTGAC ATG 33

(2) INFORMATION FOR SEQ ID NO: 103

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103

RGYRTGCATG ATCAYGTCCG VYGCCTCATA CAC 33

(2) INFORMATION FOR SEQ ID NO: 104

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104

RTTGTYYTCC CGRACGCARG GCACGCACCC RGG 33

(2) INFORMATION FOR SEQ ID NO: 105

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105

CGTGGGRGTS AGCGCZ71CCC AGCARCGGGA GSW 33

(2) INFORMATION FOR SEQ ID NO: 106

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106

YGTRGTGGGG AYGCTGKHRT TCCTGGCCGC VAR 33

(2) INFORMATION FOR SEQ ID NO: 107

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107

CCCRACGAGC AARTCGACRT GRCGTCGTAW TGT 33

(2) INFORMATION FOR SEQ ID NO: 108

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108

YCCCACGTAC ATAGCSGAIIS AGARRGYAGC CGY 33

(2) INFORMATION FOR SEQ ID NO: 109

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109

CTGGGAGAYR AGRAAAACAG ATCCGCARAG RTC 33

(2) INFORMATION FOR SEQ ID NO: 110

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110

YGTCTCRTGC CGGCCAGSBG AGAAGGTGAA YAG 33

(2) INFORMATION FOR SEQ ID NO: 111

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111

GCCGGGATAG AKKGAGCART TGCAKTCCTG YAC 33

(2) INFORMATION FOR SEQ ID NO: 112

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112

CATATCCCAA GCCATRCGRT GGCCTGAYAC CTG 33

(2) INFORMATION FOR SEQ ID NO: 113

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113

CACTARGGCT GYYGTRGGYG ACCAGTTCAT CAT 33

(2) INFORMATION FOR SEQ ID NO: 114

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114

GACRGCTTGT GGGATCCGGA GTAACTGCGA YAC 33

(2) INFORMATION FOR SEQ ID NO: 115

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115

GACTCCCCAG TGRGCCCCCG CCACCATRTC CAT 33

(2) INFORMATION FOR SEQ ID NO: 116

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116

SCCCACCATG GAWWAGTAGG CAAGGCCCGC YAG 33

(2) INFORMATION FOR SEQ ID NO: 117

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117

GAGTAGCATC ACAATCAADA CCTTAGCCCA GTT 33

(2) INFORMATION FOR SEQ ID NO: 118

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118

YGWCRY8YRG GTRTKCCCGT CAACGCCGGC AAA 33



(2) INFORMATION FOR SEQ ID NO: 119

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119

TCCTCACAGG GGAGTGATTC ATGGTGGAGT GTC 33

(2) INFORMATION FOR SEQ ID NO: 120

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120

AT0GCTAGAC GCTTTCTGCG TGAA0ACAGT AGT 33

(2) INFORMATION FOR SEQ ID NO: 121

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121

GCCTGGAGGC TGCACGRCAC TCATACTAAC GCC 33

(2) INFORMATION FOR SEQ ID NO: 122

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122

CGCAGACCAC TATGGCTCnr CCGGGAGGGG GGG 33

(2) INFORMATION FOR SEQ ID NO: 123

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123

TCRTCCYGGC AATTCCGGTG TACTCACCGG TTC 33

(2) INFORMATION FOR SEQ ID NO: 124

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124

GCATIGA0CG GGTTDATCCA AGAAAGGACC CGG 33

(2) INFORMATION FOR SEQ ID NO: 125

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125

AGCAGTCTYG CGGGGGCACG CCCAARTCTC CAG 33

(2) INFORMATION FOR SEQ ID NO: 126

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126

ACAAGSCCTT TCGCGACCCA ACACTACTCO GCT 33

(2) INFORMATION FOR SEQ ID NO: 127

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127

GGGGCACTCG CAAGCACCCT ATCAGGCAGT ACC 33

(2) INFORMATION FOR SEQ ID NO: 128

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128

YGTGCTCATG RTGCACGGTC TACGAGACCT CCC 33

(2) INFORMATION FOR SEQ ID NO: 129

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129

GTTACGTTTG KTTYTTYTTT GRGGTTTRGG AWT 33

(2) INFORMATION FOR SEQ ID NO: 130

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130

CGGGAACTTR ACGTCCTGTG GGCGRCGGTT GGT 33

(2) INFORMATION FOR SEQ ID NO: 131

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131

CARGTAAACT CCACCRACGA TCTGRCCRCC RCC 33

(2) INFORMATION FOR SEQ ID NO: 132

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132

RCGCACACCC AAYCTRG0GC CCCTGCGCGG CAA 33

(2) INFORMATION FOR SEQ ID NO: 133

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133

AGGTTGCGAC CGCTCGGAAG TCTTYCTRGT CGC 33

(2) INFORMATION FOR SEQ ID NO: 134

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134

RCGHRCCTTG GGGATAGGCT GACGTCWACC TCG 33

WO 92/19743 PCT/US92/04036

(2) INFORMATION FOR SEQ ID NO: 135

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135

RCGHRCCTTG GGGATAGGTT GTCGCCWTCC ACG 33

(2) INFORMATION FOR SEQ ID NO: 136

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136

YCCRGGCTGR GCCCAGRYCC TRCCCTCGGR YYG 33

(2) INFORMATION FOR SEQ ID NO: 137

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137

BSHRCCCTCR TTRCCRTAGA GGGGCCADGG RTA 33

(2) INFORMATION FOR SEQ ID NO: 138

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138

GCCRCGGGGW GACAGGAGCC ATCCYGCCCA CCC 33

(2) INFORMATION FOR SEQ ID NO: 139

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139

CCGGGGGTCY GTGGGGCCCC AYCTAGGCCG RGA 33

(2) INFORMATION FOR SEQ ID NO: 140

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140

ATCGATGACC TTACCCAART TRCGCGACCT RCG 33

(2) INFORMATION FOR SEQ ID NO: 141

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141

CCCCATGAGR TCGGCGAAGC CGCAYGTRAG GGT 33

(2) INFORMATION FOR SEQ ID NO: 142

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142

GCCYCCWARR GGGGCGCCGA CGAGCGGWAT RTA 33

(2) INFORMATION FOR SEQ ID NO: 143

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 143

AACCCGGACR CCRTGYGCCA RGGCCCTGGC AGC 33

(2) INFORMATION FOR SEQ ID NO: 144

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144

RTTCCCTGTT GCATAGTTCA CGCCGTOTTC CAG 33

(2) INFORMATION FOR SEQ ID NO: 145

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145

CAKRAGGAAG AKAGAGAAAG AGCAACCRGG VSKR 33

(2) INFORMATION FOR SEQ ID NO: 146

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146

AGGCATAGGA CCCGTGTCTT 20

(2) INFORMATION FOR SEQ ID NO: 147

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147

CTTCTTTGGA GAAAGTGGTG 20
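
The probe sequences in this listing mix the four bases with IUPAC ambiguity codes such as R (A or G) and Y (C or T), so a single listed 33-mer can stand for a pool of related oligonucleotides. The following Python sketch is offered as an illustration only and is not part of the specification: the helper names `clean`, `is_valid` and `degeneracy` are hypothetical, and the code table is the standard IUPAC nucleotide nomenclature. It checks a listed sequence against its stated LENGTH and counts how many distinct oligonucleotides it represents, using SEQ ID NO: 92 as the example.

```python
# Standard IUPAC nucleotide codes, written as the bases each code allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def clean(listing_line):
    """Remove the spaces that split a listing line into blocks of ten."""
    return "".join(listing_line.split()).upper()


def is_valid(seq, expected_length):
    """True if seq has the stated length and only IUPAC nucleotide codes."""
    return len(seq) == expected_length and all(base in IUPAC for base in seq)


def degeneracy(seq):
    """Number of distinct A/C/G/T oligonucleotides the sequence represents."""
    count = 1
    for base in seq:
        count *= len(IUPAC[base])
    return count


# SEQ ID NO: 92 as given in the listing (LENGTH: 33 nucleotides).
seq92 = clean("GCCGGGATAG ATRGARCAAT TGCARYCTTG CGT")
print(is_valid(seq92, 33), degeneracy(seq92))
```

A character outside the IUPAC alphabet, such as a digit, makes `is_valid` return False, which gives a quick way to spot transcription errors in a listing.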
CLAIMS

1. As a composition of matter, a non-naturally
occurring nucleic acid having a non-HCV-1 nucleotide
sequence of eight or more nucleotides corresponding to
a nucleotide sequence within the hepatitis C virus
genome.

2. The composition of claim 1 wherein said nucleotide
sequence corresponding to a non-HCV-1 nucleotide
sequence within the hepatitis C virus genome is
selected from the regions consisting of the NS5 region,
envelope 1 region, 5'UT region, and the core region.

3. The composition of claim 1 wherein said nucleotide
sequence corresponding to a non-HCV-1 nucleotide
sequence within the hepatitis C virus genome
corresponds to a sequence in the NS5 region.

4. The composition of claim 3 wherein said nucleotide
sequence corresponding to a non-HCV-1 sequence within
the hepatitis C virus genome is selected from a
sequence within sequences numbered 2-22.

5. The composition of claim 1 wherein said nucleotide
sequence corresponding to a non-HCV-1 nucleotide
sequence within the hepatitis C virus genome
corresponds to a sequence in the envelope 1 region.

6. The composition of claim 5 wherein said nucleotide
sequence corresponding to a non-HCV-1 sequence within
the hepatitis C virus genome corresponds to a sequence
within sequence numbers 24-32.

7. The composition of claim 1 wherein at least one
sequence corresponding to a non-HCV-1 nucleotide
sequence within the hepatitis C virus genome
corresponds to a sequence in the 5'UT region.

8. The composition of claim 7 wherein said nucleotide
sequence corresponding to a non-HCV-1 sequence within
the hepatitis C virus genome corresponds to a sequence
within sequences numbered 34-51.

9. The composition of claim 1 wherein said nucleotide
sequence corresponding to a non-HCV-1 nucleotide
sequence within the hepatitis C virus genome
corresponds to a sequence in the core region.

10. The composition of claim 9 wherein said nucleotide
sequence corresponding to a non-HCV-1 sequence within
the hepatitis C virus genome corresponds to a sequence
within sequences numbered 53-66.

11. The composition of claim 1 wherein said
non-naturally occurring nucleic acid has a nucleotide
sequence corresponding to one or more genotypes of
hepatitis C virus.

12. The composition of claim 11 wherein said
non-naturally occurring nucleic acid has a sequence
corresponding to a sequence of a first genotype which
first genotype is defined substantially by sequences
numbered 1-6 in the NS5 region, 23-25 in the envelope 1
region, 33-38 in the 5'UT region, and 52-57 in the core
region.

13. The composition of claim 11 wherein said
non-naturally occurring nucleic acid has a sequence
corresponding to a sequence of a second genotype which
second genotype is defined substantially by sequences
numbered 7-12 in the NS5 region, 26-28 in the envelope
1 region, 39-45 in the 5'UT region, and 58-64 in the
core region.

14. The composition of claim 11 wherein said
non-naturally occurring nucleic acid has a sequence
corresponding to a sequence of a third genotype which
third genotype is defined substantially by sequences
numbered 13-17 in the NS5 region, 32 in the envelope 1
region, 46-47 in the 5'UT region and 65-66 in the core
region.

15. The composition of claim 11 wherein said
non-naturally occurring nucleic acid has a sequence
corresponding to a sequence of a fourth genotype which
fourth genotype is defined substantially by sequences
numbered 20-22 in the NS5 region, 29-31 in the envelope
1 region and 48-49 in the 5'UT region.

16. The composition of claim 11 wherein said 
non-naturally occurring nucleic acid has a sequence 
corresponding to a sequence of a fifth genotype which 
fifth genotype is defined substantially by sequences 
numbered 18-19 in the NS5 region and 50-51 in the 5'UT 
region. 
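
Claims 12 to 16 define five genotypes by the sequence numbers assigned to each genomic region. As an illustration only, that assignment can be written as a lookup table. In the Python sketch below the names `GENOTYPES` and `classify` are hypothetical, and the ranges are copied from the claims.

```python
# Sequence-number ranges per genotype and region, as recited in claims 12-16.
# Python ranges exclude their upper bound, so range(1, 7) covers 1-6.
GENOTYPES = {
    "first": {
        "NS5": range(1, 7), "envelope 1": range(23, 26),
        "5'UT": range(33, 39), "core": range(52, 58),
    },
    "second": {
        "NS5": range(7, 13), "envelope 1": range(26, 29),
        "5'UT": range(39, 46), "core": range(58, 65),
    },
    "third": {
        "NS5": range(13, 18), "envelope 1": range(32, 33),
        "5'UT": range(46, 48), "core": range(65, 67),
    },
    "fourth": {
        "NS5": range(20, 23), "envelope 1": range(29, 32),
        "5'UT": range(48, 50),
    },
    "fifth": {
        "NS5": range(18, 20), "5'UT": range(50, 52),
    },
}


def classify(seq_number):
    """Return (genotype, region) for a sequence number, or None if unassigned."""
    for genotype, regions in GENOTYPES.items():
        for region, numbers in regions.items():
            if seq_number in numbers:
                return genotype, region
    return None


print(classify(32))
```

Under this table each sequence number from 1 to 66 is assigned to one genotype and one region, and numbers above 66 are unassigned; for example, `classify(32)` gives the third genotype in the envelope 1 region.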

17. The composition of claim 1 wherein said
non-naturally occurring nucleic acid is capable of
priming a reaction for the synthesis of nucleic acid to
form a nucleic acid having a nucleotide sequence
corresponding to hepatitis C virus.

18. The composition of claim 1 wherein said
non-naturally occurring nucleic acid has label means
for detecting a hybridization product.

19. The composition of claim 1 wherein said
non-naturally occurring nucleic acid has support means
for separating a hybridization product from solution.

20. The composition of claim 1 wherein said
non-naturally occurring nucleic acid prevents the
transcription or translation of viral nucleic acid.

21. A method of forming a hybridization product with a
hepatitis C virus nucleic acid comprising the following
steps:

a. placing a non-naturally occurring nucleic
acid having a nucleotide sequence of eight or
more nucleotides corresponding to a non-HCV-1
sequence in the hepatitis C viral genome into
conditions in which hybridization conditions
can be imposed, said non-naturally occurring
nucleic acid capable of forming a
hybridization product with said hepatitis C
virus nucleic acid under hybridization
conditions; and

b. imposing hybridization conditions to form a
hybridization product in the presence of
hepatitis C virus nucleic acid.

22. The method of claim 21 wherein said nucleotide
sequence corresponding to a non-HCV-1 sequence in the
hepatitis C virus genome corresponds to a sequence
within at least one of the regions consisting
essentially of NS5 region, envelope 1 region, 5'UT
region, and the core region.

23. The method of claim 21 wherein said nucleotide
sequence corresponding to a non-HCV-1 sequence
corresponds to a sequence within the NS5 region.

24. The method of claim 23 wherein said nucleotide
sequence corresponding to a non-HCV-1 sequence
corresponds to a sequence within sequences numbered
2-22.

25. The method of claim 21 wherein said nucleotide
sequence corresponding to a non-HCV-1 sequence
corresponds to a sequence within the envelope 1 region.

26. The method of claim 25 wherein said nucleotide
sequence corresponding to a non-HCV-1 sequence is
selected from a sequence within sequences numbered
24-32.

27. The method of claim 21 wherein said nucleotide
sequence corresponds to a non-HCV-1 sequence
corresponding to a sequence within the 5'UT region.

28. The method of claim 27 wherein said nucleotide
sequence corresponds to a non-HCV-1 sequence selected
from a sequence within sequences numbered 34-51.

29. The method of claim 21 wherein said nucleotide
sequence corresponds to a non-HCV-1 sequence
corresponding to a sequence within the core region.

30. The method of claim 29 wherein said nucleotide
sequence corresponds to a non-HCV-1 sequence selected
from a sequence within sequences numbered 53-66.

31. The method of claim 21 wherein said nucleotide
sequence corresponds to a non-HCV-1 nucleotide sequence
corresponding to one or more genotypes of hepatitis C
virus.

32. The method of claim 21 wherein said non-naturally
occurring nucleic acid has a sequence corresponding to
a sequence of a first genotype which first genotype is
defined substantially by sequences numbered 1-6 in the
NS5 region, 23-25 in the envelope 1 region, 33-38 in
the 5'UT region, and 52-57 in the core region.

33. The method of claim 21 wherein said non-naturally
occurring nucleic acid has a sequence corresponding to
a sequence of a second genotype which second genotype
is defined substantially by sequences numbered 7-12 in
the NS5 region, 26-28 in the envelope 1 region, 39-45
in the 5'UT region, and 58-64 in the core region.

34. The method of claim 21 wherein said non-naturally
occurring nucleic acid has a sequence corresponding to
a sequence of a third genotype which third genotype is
defined substantially by sequences numbered 13-17 in
the NS5 region, 32 in the envelope 1 region, 46-47 in
the 5'UT region and 65-66 in the core region.

35. The method of claim 21 wherein said non-naturally
occurring nucleic acid has a sequence corresponding to
a sequence of a fourth genotype which fourth genotype
is defined substantially by sequences numbered 20-22 in
the NS5 region, 29-31 in the envelope 1 region and
48-49 in the 5'UT region.

36. The method of claim 21 wherein said non-naturally
occurring nucleic acid has a sequence corresponding to
a sequence of a fifth genotype which fifth genotype is
defined substantially by sequences numbered 18-19 in
the NS5 region and 50-51 in the 5'UT region.

37. The method of claim 21 wherein said hybridization 
product is capable of priming a reaction for the 
synthesis of nucleic acid. 

38. The method of claim 21 wherein said non-naturally
occurring nucleic acid has label means for detecting a
hybridization product.

39. The method of claim 21 wherein said non-naturally
occurring nucleic acid has support means for separating
the hybridization product from solution.

40. The method of claim 21 wherein said non-naturally
occurring nucleic acid prevents the transcription or
translation of viral nucleic acid.

41. As a composition of matter, a non-naturally
occurring polypeptide corresponding to a non-HCV-1
nucleotide sequence of nine or more nucleotides which
sequence of nine or more nucleotides corresponds to a
sequence within hepatitis C virus genomic sequences.

42. The composition of claim 41 wherein said non-HCV-1
sequence is selected from one of the regions consisting
of NS5 region, envelope 1 region, and the core region.

43. The composition of claim 41 wherein said non-HCV-1
nucleotide sequence corresponds to a sequence in the
NS5 region.

44. The composition of claim 43 wherein said non-HCV-1
sequence is selected from a sequence within sequences
numbered 2-22.

45. The composition of claim 41 wherein said non-HCV-1
sequence corresponds to a sequence in the envelope 1
region.

46. The composition of claim 45 wherein said non-HCV-1
sequence is selected from a sequence within sequences
numbered 24-32.

47. The composition of claim 41 wherein said non-HCV-1
sequence corresponds to a sequence in the core region.

48. The composition of claim 47 wherein said non-HCV-1
sequence is selected from a sequence within sequences
numbered 52-66.

49. The composition of claim 41 wherein said non-HCV-1
nucleotide sequence has a nucleotide sequence
corresponding to one or more genotypes of hepatitis C
virus.

50. The composition of claim 41 wherein said non-HCV-1
nucleotide sequence has a sequence corresponding to a
sequence of a first genotype which first genotype is
defined substantially by sequences numbered 1-6 in the
NS5 region, 23-25 in the envelope 1 region, and 52-57
in the core region.

51. The composition of claim 41 wherein said non-HCV-1
nucleotide sequence has a sequence corresponding to a
sequence of a second genotype which second genotype is
defined substantially by sequences numbered 7-12 in the
NS5 region, 26-28 in the envelope 1 region, and 58-64
in the core region.

52. The composition of claim 41 wherein said non-HCV-1
nucleotide sequence has a sequence corresponding to a
sequence of a third genotype which third genotype is
defined substantially by sequences numbered 13-17 in
the NS5 region, 32 in the envelope 1 region, and 65-66
in the core region.

53. The composition of claim 41 wherein said non-HCV-1
nucleotide sequence has a sequence corresponding to a
sequence of a fourth genotype which fourth genotype is
defined substantially by sequences numbered 20-22 in
the NS5 region, 29-31 in the envelope 1 region and
48-49 in the 5'UT region.

54. The composition of claim 41 wherein said non-HCV-1
nucleotide sequence has a sequence corresponding to a
sequence of a fifth genotype which fifth genotype is
defined substantially by sequences numbered 18-19 in
the NS5 region and 50-51 in the 5'UT region.

55. The composition of claim 41 wherein said
polypeptide is capable of generating an immune reaction
in a host.

56. An antibody capable of selectively binding to the
composition of claim 41.

57. A method of detecting one or more genotypes of
hepatitis C virus comprising the following steps:

a) placing a non-naturally occurring nucleic acid
having a nucleotide sequence of eight or more
nucleotides corresponding to one or more genotypes
of hepatitis C virus under conditions where
hybridization conditions can be imposed;

b) imposing hybridization conditions to form a
hybridization product in the presence of hepatitis
C virus nucleic acid; and

c) monitoring the non-naturally occurring nucleic
acid for the formation of a hybridization product,
which hybridization product is indicative of the
presence of the genotype of hepatitis C virus.
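
Steps a) to c) of claim 57 can be mimicked in silico by asking whether a probe is complementary to some stretch of a target strand. The Python sketch below is a simplified illustration under stated assumptions: it requires perfect Watson-Crick complementarity, ignores temperature, salt concentration and mismatch tolerance, and uses a made-up 8-nucleotide probe and made-up targets rather than sequences from the listing. The function names are hypothetical.

```python
import re

# Standard IUPAC nucleotide codes, written as the bases each code allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand):
    """Reverse complement of an unambiguous (A/C/G/T) strand."""
    return strand.translate(COMPLEMENT)[::-1]


def probe_pattern(probe):
    """Compile a degenerate probe into a regular expression."""
    return re.compile("".join("[" + IUPAC[base] + "]" for base in probe))


def hybridizes(probe, target):
    """True if some stretch of target is perfectly complementary to probe."""
    return probe_pattern(probe).search(reverse_complement(target)) is not None


probe = "GATTRCAY"                   # hypothetical probe, eight nucleotides
matching_target = "TTTGTGTAATCAAA"   # contains the reverse complement of GATTACAC
other_target = "TTTGTGTCCTCAAA"      # no complementary stretch

print(hybridizes(probe, matching_target))  # True
print(hybridizes(probe, other_target))     # False
```

Running one such check per genotype-specific probe and recording which probes match corresponds to the monitoring step c); a laboratory assay would additionally depend on the hybridization conditions, which this sketch does not model.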

58. The method of claim 57 wherein said non-naturally
occurring nucleic acid has a sequence corresponding to
a sequence of a first genotype which first genotype is
defined substantially by sequences numbered 1-6 in the
NS5 region, 23-25 in the envelope 1 region, 33-38 in
the 5'UT region, and 52-57 in the core region.

59. The method of claim 57 wherein said non-naturally
occurring nucleic acid has a sequence corresponding to
a sequence of a second genotype which second genotype
is defined substantially by sequences numbered 7-12 in
the NS5 region, 26-28 in the envelope 1 region, 39-45
in the 5'UT region, and 58-64 in the core region.

60. The method of claim 57 wherein said non-naturally
occurring nucleic acid has a sequence corresponding to
a sequence of a third genotype which third genotype is
defined substantially by sequences numbered 13-17 in
the NS5 region, 32 in the envelope 1 region, 46-47 in
the 5'UT region and 65-66 in the core region.

61. The method of claim 57 wherein said non-naturally
occurring nucleic acid has a sequence corresponding to
a sequence of a fourth genotype which fourth genotype
is defined substantially by sequences numbered 20-22 in
the NS5 region, 29-31 in the envelope 1 region and
48-49 in the 5'UT region.

62. The method of claim 57 wherein said non-naturally
occurring nucleic acid has a sequence corresponding to
a sequence of a fifth genotype which fifth genotype is
defined substantially by sequences numbered 18-19 in
the NS5 region.

63. The method of claim 57 wherein said non-naturally
occurring nucleic acid has a sequence corresponding to
a sequence numbered 67-145.

64. The method of claim 57 wherein said non-naturally
occurring nucleic acid has a sequence corresponding to
a sequence numbered 69, 71, 73 and 81-99 to identify
Group I genotypes in the core and region of the HCV
genome.

65. The method of claim 57 wherein said non-naturally
occurring nucleic acid has a sequence corresponding to
a sequence numbered 70, 72, 70 and 100-118 to identify
Group II genotypes in the core and envelope regions of
the HCV genome.

66. The method of claim 57 wherein said non-naturally
occurring nucleic acid has a sequence corresponding to
a sequence numbered 77 to identify Group III genotypes
in the 5'UT region of the HCV genome.

67. The method of claim 57 wherein said non-naturally
occurring nucleic acid has a sequence numbered 79 to
identify Group IV genotypes in the 5'UT region of the
HCV genome.
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